Further Algebra 
and 
Applications 
P.M. Cohn. 


Springer


Springer 
London 

Berlin 
Heidelberg 
New York 
Hong Kong 
Milan 

Paris 

Tokyo 


P.M. Cohn 


Further Algebra 
and Applications 


With 27 Figures 


Springer


P.M. Cohn, MA, PhD, FRS 
Department of Mathematics, University College London, 
Gower Street, London WC1E 6BT


British Library Cataloguing in Publication Data 
Cohn, P. M. (Paul Moritz) 
Further algebra and applications 
1. Algebra 2. Algebra - Problems, exercises, etc.
I. Title 
512 
ISBN 1852336676 


Library of Congress Cataloging-in-Publication Data 
Cohn, P.M. (Paul Moritz) 
Further algebra and applications/P.M. Cohn. 
p. cm. 
Rev. ed. of: Algebra. 2nd ed. c1982-c1991. 
Includes bibliographical references and indexes. 
ISBN 1-85233-667-6 (alk. paper) 
1. Algebra. I. Cohn, P.M. (Paul Moritz). Algebra. II. Title.


QA154.3.C64 2003 
512-dc21 2002026862 


Apart from any fair dealing for the purposes of research or private study, or criticism or review, as 
permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, 
stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, 
or in the case of reprographic reproduction in accordance with the terms of licences issued by the 
Copyright Licensing Agency. Enquiries concerning reproduction outside those terms should be sent to the 
publishers. 


ISBN 1-85233-667-6 Springer-Verlag London Berlin Heidelberg 
a member of BertelsmannSpringer Science+Business Media GmbH 
http://www.springer.co.uk


© Professor P.M. Cohn 2003 
Printed in Great Britain 


The use of registered names, trademarks etc. in this publication does not imply, even in the absence of a 
specific statement, that such names are exempt from the relevant laws and regulations and therefore free 
for general use. 


The publisher makes no representation, express or implied, with regard to the accuracy of the information 
contained in this book and cannot accept any legal responsibility or liability for any errors or omissions 
that may be made. 


Typesetting by BC Typesetting, Bristol BS31 1NZ 
Printed and bound at The Cromwell Press, Trowbridge, Wiltshire, UK 
12/3830-543210 Printed on acid-free paper SPIN 10883345 


Contents


Conventions on terminology
Preface


1. Universal algebra

1.1 Algebras and homomorphisms
1.2 Congruences and the isomorphism theorems
1.3 Free algebras and varieties
1.4 The diamond lemma
1.5 Ultraproducts
1.6 The natural numbers


2. Homological algebra

2.1 Additive and abelian categories
2.2 Functors on abelian categories
2.3 The category of modules
2.4 Homological dimension
2.5 Derived functors
2.6 Ext, Tor and global dimension
2.7 Tensor algebras, universal derivations and syzygies


3. Further group theory

3.1 Group extensions
3.2 Hall subgroups
3.3 The transfer
3.4 Free groups
3.5 Linear groups
3.6 The symplectic group
3.7 The orthogonal group


4. Algebras

4.1 The Krull-Schmidt theorem
4.2 The projective cover of a module
4.3 Semiperfect rings
4.4 Equivalence of module categories
4.5 The Morita context
4.6 Projective, injective and flat modules
4.7 Hochschild cohomology and separable algebras


5. Central simple algebras

5.1 Simple Artinian rings
5.2 The Brauer group
5.3 The reduced norm and trace
5.4 Quaternion algebras
5.5 Crossed products
5.6 Change of base field
5.7 Cyclic algebras


6. Representation theory of finite groups

6.1 Basic definitions
6.2 The averaging lemma and Maschke's theorem
6.3 Orthogonality and completeness
6.4 Characters
6.5 Complex representations
6.6 Representations of the symmetric group
6.7 Induced representations
6.8 Applications: The theorems of Burnside and Frobenius


7. Noetherian rings and polynomial identities

7.1 Rings of fractions
7.2 Principal ideal domains
7.3 Skew polynomial rings and Laurent series
7.4 Goldie's theorem
7.5 PI-algebras
7.6 Varieties of PI-algebras and Regev's theorem
7.7 Generic matrix rings and central polynomials
7.8 Generalized polynomial identities


8. Rings without finiteness assumptions

8.1 The density theorem revisited
8.2 Primitive rings
8.3 Semiprimitive rings and the Jacobson radical
8.4 Non-unital algebras
8.5 Semiprime rings and nilradicals
8.6 Prime PI-algebras
8.7 Firs and semifirs


9. Skew fields

9.1 Generalities
9.2 The Dieudonné determinant
9.3 Free fields
9.4 Valuations on skew fields
9.5 Pseudo-linear extensions


10. Coding theory

10.1 The transmission of information
10.2 Block codes
10.3 Linear codes
10.4 Cyclic codes
10.5 Other codes


11. Languages and automata

11.1 Monoids and monoid actions
11.2 Languages and grammars
11.3 Automata
11.4 Variable-length codes
11.5 Free algebras and formal power series rings


Bibliography
List of notations
Author index
Subject index


Preface 


This volume follows on the subject matter treated in Basic Algebra and together with 
that volume represents the contents of volumes 2 and 3 of my book on algebra, now 
out of print; the topics have been rearranged a little, with most of the applications in 
the present volume, while the basic theories (groups, rings, fields) are pursued 
further in the earlier book. In any case all parts of volumes 2 and 3 are represented. 
The whole text has been revised, some exercises have been added and of course errors 
have been corrected; I am grateful to a number of readers for bringing such errors to
my attention. 

Chapter 1 presents the basic notions of universal algebra: the isomorphism 
theorems, free algebras and varieties, with the natural numbers, viewed as an algebra
with a unary operator, as an application, as well as the ultraproduct theorem and
the diamond lemma. The introduction to homological algebra in Chapter 2 goes 
as far as derived functors and global dimension, with the case of polynomial rings 
and free algebras as an application. Chapter 3, on group theory, discusses some 
items of general interest and importance (group extensions, Hall subgroups, trans- 
fer), but also topics which find an echo elsewhere in the book, such as free groups 
and linear groups. Chapter 4, on algebras, deals with the Krull-Schmidt theorem, 
projective covers, Morita equivalence and related matters, but stops short of the 
representation theory of algebras, which would have required more space than was 
available. This is followed by an account of central simple algebras (Chapter 5), 
introducing the Brauer group and crossed products. The representation theory of 
finite groups in Chapter 6 presents the standard facts on representations and 
characters and illustrates this work by the symmetric group. The next two chapters 
return to rings; Chapter 7 presents topics on Noetherian rings such as Goldie’s 
theory, as well as polynomial identities and central polynomials, while Chapter 8 
deals with the general density theorem, the various radicals and non-unital algebras. 
Chapter 9, on skew fields, gives a simplified treatment of the Dieudonné determinant 
and establishes the existence of 'free fields'. Its proof is based on the specialization
lemma, which is of independent interest. 

The final two chapters are applications of a different kind. Chapter 10 is an intro- 
duction to block codes, in particular linear codes and cyclic codes, as well as some 
other kinds. Chapter 11 deals with algebraic language theory and the related 
topics of variable-length codes, automata and power series rings. In both chapters 
it is only possible to take the first steps in the subject, but we go far enough to 
show how techniques from coding theory are used in the study of free algebras. 




The text assumes an acquaintance with much of Basic Algebra, to which reference 
is made in the form ‘BA’ followed by the section number. Definitions and key 
properties are usually recalled in some detail, but not necessarily on their first occur- 
rence; the reader can easily trace explanations through the index. As before, there are 
occasional historical references and numerous exercises, often with hints, though no 
solutions. 

A number of colleagues and friends have made comments on the earlier edition 
and I would like to express my thanks to them here. My thanks also go to the 
staff of Springer-Verlag London and to Mrs Lyn Imeson for the efficient way they 
have carried out their task. 

University College London P.M. Cohn 
October 2002 


Conventions on 
Terminology and notes to 
the reader 


References to Basic Algebra are in the form BA, followed by the section number. 

A property is said to hold for almost all members of a set if it holds for all but 
a finite number. The complement of a subset Y in a set X is written X\Y. As a 
rule mappings are written on the right; in particular this is done when mappings
have to be composed, so that αβ means: first α, then β. If α is a mapping from a
set X and Y is a subset of X, then the restriction of α to Y is written α|Y.

All rings and monoids have a unit element or one, which acts as neutral element 
for multiplication, usually denoted by 1; by contrast an algebra (over a coefficient 
ring) need not have a one. A ring is trivial or the zero ring if it consists of 0 
alone; this happens just when 1 = 0. An element a of a ring is called a zero-divisor 
if a ≠ 0 and ab = 0 or ba = 0 for some b ≠ 0; if a is neither 0 nor a zero-divisor,
it is said to be regular (see Section 7.1). A non-trivial ring without zero-divisors is 
called an integral domain; this term is not taken to imply commutativity. A ring 
in which the non-zero elements form a group under multiplication is called a skew 
field; in the commutative case this reduces to a field, but sometimes (in Chapter 9) 
this term is also used in the general case. In any ring R, the set of all non-zero 
elements is denoted by R*; this notation is mainly used for integral domains, 
where R* is a monoid. A skew field finite-dimensional over its centre is called a divi- 
sion algebra, but the term ‘algebra’ by itself is not taken to imply finite dimension- 
ality. A ring is said to have invariant basis number (IBN) if any two bases of a free 
module have the same number of elements, or equivalently, if any matrix with a 
two-sided inverse is square (see BA, Section 4.6). 

References to the bibliography are by name of author and date in round brackets 
for books and square brackets for papers. As in BA, all results in a section are 
numbered consecutively; further we abbreviate ‘if and only if’ by iff (except in 
enunciations) and use ■ to indicate the end (or absence) of a proof.

The chapters are to a large extent independent, so no interdependence chart has 
been given, but the reader may have to turn back for the occasional result; this is 
usually clearly indicated. 


Universal algebra 


Most algebraic systems such as groups, vector spaces, rings, lattices etc. can be 
regarded from a common point of view as sets with operations defined on them, sub- 
ject to certain laws. This is done in Section 1.1 and it allows many basic results, such 
as the isomorphism theorems, to be stated and proved quite generally, as we shall see 
in Section 1.2. Of the general theory of universal algebra (by now quite extensive), we 
shall need very little, and this forms the subject of Section 1.3; in addition to the basic
concepts we define the notion of an algebraic variety, i.e. a class of algebraic systems 
defined by identical relations, or laws. But there are one or two other topics, not 
strictly part of the subject that are needed: the diamond lemma forms the subject 
of Section 1.4, while dependence relations have already been discussed in BA 
(Section 11.1). There is also the ultraproduct theorem in Section 1.5, a result from 
logic with many uses (see Chapter 7). The chapter ends in Section 1.6 with an axio- 
matic development of the natural numbers, regarded as an algebraic system, in an 
account following Leon Henkin [1960] (see also Cohn (1981)). 


1.1 Algebras and homomorphisms 


Algebraic structures show certain common features: they have operations defined on 
them, which satisfy laws such as the associative law. These operations are mostly 
binary, like addition or multiplication, but sometimes they are unary, e.g. taking 
the inverse of a number, or ternary, e.g. the basic operation in a ternary ring, occur- 
ring in the study of projective planes (see M. Hall (1959)), or even noughtary, like 
the neutral element in a group. For any integer n ≥ 0 we define an n-ary operation
on a set S to be a mapping of S^n into S. The number n is called the arity of the
operation and we say unary for 1-ary, binary for 2-ary, ternary for 3-ary and
finitary to mean n-ary for some natural number n. A 0-ary operation on S is just
a particular element of S; this is also called a constant operation. 

An algebra is to be thought of as a set with certain finitary operations defined on it, 
but in order to compare different algebras we need to establish a correspondence 
between their sets of operations. This is done by indexing the operations in each 
algebra by a given index set, which is kept fixed in any discussion. Its elements are 
called operators, each with a given arity. 




Thus by an operator domain we understand a set Ω and a mapping a : Ω → N_0.
The elements of Ω are called operators; if ω ∈ Ω, then a(ω) ∈ N_0 is called the
arity of ω. We shall write Ω(n) = {ω ∈ Ω | a(ω) = n}, and refer to the members of
Ω(n) as n-ary operators.

An Ω-algebra is defined as a pair (A, Ω) consisting of a set A together with a family
of operations indexed by Ω:


ω : A^n → A for each ω ∈ Ω(n), n = 0, 1, 2, .... (1.1.1)


The set A is called the carrier of the algebra. Strictly speaking we should denote the
algebra by (A, Ω, φ), where φ is the family of mappings φ_n : Ω(n) → Map(A^n, A)
defined by (1.1.1), but usually we shall not distinguish notationally between an
algebra and its carrier. The set Ω is called the operator domain, or also the signature
of the algebra. We give some examples.


1. Groups. A group (G, ·, ^{-1}, 1) is given by a set with a binary operation
(multiplication), a unary operation (inversion) and a constant operation (the
neutral element), satisfying certain laws which are familiar to the reader (see
Section 1.3 below).

2. Rings. A ring (R, +, −, ×, 0, 1) is given by a set with two binary operations +, ×,
two constant operations 0, 1 and a unary operation −, again satisfying well-
known laws.

3. Lattices. A lattice may be defined as a partially ordered set in which each pair of
elements has a supremum and an infimum, or as an algebra (L, ∨, ∧) with two
binary operations satisfying certain laws (see BA, Section 3.1). For Boolean
algebras we require in addition a constant operation 0 and a unary operation ′,
which leads to another constant operation 1 = 0′, an instance of a derived
operation (see Section 1.3).

4. Vector spaces. Let k be a field. A vector space over k is an algebra (V, +, 0, k) with
a binary operation +, a constant operation 0 and a family of unary operations
indexed by k: ω_α : u ↦ αu (u ∈ V, α ∈ k), satisfying the laws familiar from linear
algebra. For an infinite field k this is an example of an algebra with an infinite
signature.

5. A 1-element set has a unique Ω-algebra structure for any Ω. This is called the
trivial Ω-algebra.

6. The empty set is an Ω-algebra precisely when Ω has no constant operators.
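
The definition of an Ω-algebra can be made concrete on a small finite carrier. The following is only a sketch in Python, under the assumption that a signature is encoded as a dictionary from operator names to arities and an algebra as a carrier together with one function per operator; the names GROUP_SIGNATURE, cyclic_group and is_algebra are hypothetical and chosen for illustration.

```python
from itertools import product

# Assumed encoding (not from the text): operator name -> arity.
GROUP_SIGNATURE = {"mul": 2, "inv": 1, "one": 0}

def cyclic_group(n):
    """Integers mod n under addition, as an algebra of group signature."""
    carrier = set(range(n))
    ops = {
        "mul": lambda a, b: (a + b) % n,
        "inv": lambda a: (-a) % n,
        "one": lambda: 0,  # a 0-ary operation just picks out an element
    }
    return carrier, ops

def is_algebra(carrier, ops, signature):
    """Check that each n-ary operator maps carrier^n into the carrier."""
    for name, arity in signature.items():
        for args in product(carrier, repeat=arity):
            if ops[name](*args) not in carrier:
                return False
    return True

carrier, ops = cyclic_group(6)
print(is_algebra(carrier, ops, GROUP_SIGNATURE))

# Example 6: the empty set carries an algebra structure only if the
# signature has no constant operators.
print(is_algebra(set(), ops, {"mul": 2, "inv": 1}))
print(is_algebra(set(), ops, GROUP_SIGNATURE))
```

The last two calls differ only in whether the signature contains the constant "one": with no constants there is nothing to check on the empty carrier, while a constant would have to lie in it.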


Given an Ω-algebra A and ω ∈ Ω(n), we can apply ω to any n-tuple a_1, ..., a_n ∈ A
and obtain another element of A, which is written a_1 ... a_n ω. In the case n = 0 this
just singles out an element of A, denoted by ω; the zero element in a ring is an
example.

Many algebraic concepts can be formulated for general Ω-algebras. Thus given an
Ω-algebra A, an Ω-subalgebra is an Ω-algebra B whose carrier is a subset of that of A
and which is closed under the operations of Ω, as defined in A. It is clear from the
definition that a given subset of A can be defined as a subalgebra of A in at most one
way. To give an example, the ring Z of integers has no proper subrings, because the
constant operation 1 already generates the whole of Z. The subset {0} is again a ring,
but it is not a subring because the operation 1 has different values on Z and on {0}.

It is not hard to see that the intersection of any family of subalgebras of a given 
algebra A is again a subalgebra of A. Hence for any subset X of A we can form 
the intersection of all subalgebras containing X. This is called the subalgebra of A 
generated by X; it may also be obtained by applying the operations of Ω to the
elements of X and repeating this operation a finite number of times. If the subalgebra 
generated by X is the whole of A, then X is called a generating set of A. Clearly every 
algebra A has a generating set, e.g. A itself. 
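
For a finite algebra the second description of the generated subalgebra is an algorithm: keep applying the operations until nothing new appears. The sketch below is an illustration only; it assumes the ring Z/12 as a stand-in carrier and a dictionary encoding of the signature, and the names generated_subalgebra, ring_ops, ADDITIVE and RING are hypothetical.

```python
from itertools import product

def generated_subalgebra(generators, ops, signature):
    """Close `generators` under the operations (terminates for finite carriers)."""
    closed = set(generators)
    while True:
        new = set()
        for name, arity in signature.items():
            for args in product(closed, repeat=arity):
                value = ops[name](*args)
                if value not in closed:
                    new.add(value)
        if not new:          # nothing new: `closed` is a subalgebra
            return closed
        closed |= new

N = 12  # assumed example: the ring of integers mod 12
ring_ops = {
    "add": lambda a, b: (a + b) % N,
    "neg": lambda a: (-a) % N,
    "mul": lambda a, b: (a * b) % N,
    "zero": lambda: 0,
    "one": lambda: 1,
}
ADDITIVE = {"add": 2, "neg": 1, "zero": 0}
RING = {"add": 2, "neg": 1, "mul": 2, "zero": 0, "one": 0}

# As an additive group, 8 generates the multiples of 4.
print(sorted(generated_subalgebra({8}, ring_ops, ADDITIVE)))
# As a ring with 1, even the empty set generates everything.
print(len(generated_subalgebra(set(), ring_ops, RING)))
```

The second call mirrors the remark about Z: once the constant 1 belongs to the signature, it alone generates the whole ring, so there are no proper subrings.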

A mapping f : A → B between Ω-algebras A, B is said to be compatible with
ω ∈ Ω(n) if for all a_1, ..., a_n ∈ A,


(a_1 f) ... (a_n f)ω = (a_1 ... a_n ω)f. (1.1.2)


If f is compatible with each ω ∈ Ω, it is called a homomorphism from A to B. If a
homomorphism from A to B has an inverse which is again a homomorphism, it
is called an isomorphism, and A, B are then said to be isomorphic. For example, all
1-element Ω-algebras are isomorphic. As in more special cases, an isomorphism of
an algebra with itself is called an automorphism and a homomorphism of an algebra 
into itself is an endomorphism. 
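
Condition (1.1.2) can be tested exhaustively on finite algebras. The following Python sketch assumes the rings Z/n as examples and a dictionary encoding of the signature; is_homomorphism and ring_mod are hypothetical names, and mappings are written as ordinary Python functions rather than on the right.

```python
from itertools import product

RING = {"add": 2, "neg": 1, "mul": 2, "zero": 0, "one": 0}
ADDITIVE = {"add": 2, "neg": 1, "zero": 0}

def ring_mod(n):
    """The ring of integers mod n (for n = 1 this is the zero ring, where 1 = 0)."""
    ops = {
        "add": lambda a, b: (a + b) % n,
        "neg": lambda a: (-a) % n,
        "mul": lambda a, b: (a * b) % n,
        "zero": lambda: 0,
        "one": lambda: 1 % n,
    }
    return set(range(n)), ops

def is_homomorphism(f, carrier_a, ops_a, ops_b, signature):
    """Check (a_1 f)...(a_n f)w = (a_1...a_n w)f for every operator w."""
    for name, arity in signature.items():
        for args in product(carrier_a, repeat=arity):
            if ops_b[name](*(f(a) for a in args)) != f(ops_a[name](*args)):
                return False
    return True

Z6, ops6 = ring_mod(6)
Z3, ops3 = ring_mod(3)
Z1, ops1 = ring_mod(1)

# Reduction mod 3 respects all ring operations.
print(is_homomorphism(lambda x: x % 3, Z6, ops6, ops3, RING))
# The inclusion of {0} respects +, -, 0 but not the constant 1.
print(is_homomorphism(lambda x: x, Z1, ops1, ops6, ADDITIVE))
print(is_homomorphism(lambda x: x, Z1, ops1, ops6, RING))
```

The last two calls illustrate why {0} is a ring but not a subring: the inclusion is compatible with the additive operators and fails only on the constant operator 1.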

We observe that a homomorphism is determined once it is known on a generating 
set. This is the content of 


Proposition 1.1.1. Let f, g : A → B be two homomorphisms between Ω-algebras A and
B. If f and g agree on a generating set of A, then they are equal.


Proof. The set {x ∈ A | xf = xg} is easily seen to be a subalgebra of A. By hypothesis
it contains a generating set of A, hence it is the whole of A and so f = g, as we had to
show. ■


From any family (A_i)_{i∈I} of Ω-algebras we can form the direct product P = ∏ A_i;
its carrier is the Cartesian product of the A_i, and the operations are carried out
componentwise. Thus if π_i : P → A_i are the projections from the Cartesian product
to the factors, then any ω ∈ Ω(n) is defined on P by the equation


(a_1 ... a_n ω)π_i = (a_1 π_i) ... (a_n π_i)ω. (1.1.3)


It is easily checked that this defines an Ω-algebra structure on P and the form of
(1.1.3) shows that the projection π_i is a homomorphism from P to A_i.

Of course the A_i need not be all distinct. If for example A_i = A for all i ∈ I, we
obtain the direct power of A indexed by I, which is denoted by A^I. Its members
may be regarded as functions f : I → A and the operations are defined
componentwise; e.g. if an addition is defined on A, then in A^I we have


(f + g)(i) = f(i) + g(i),   i ∈ I.
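
The componentwise definition (1.1.3) translates directly into code for finitely many finite factors. The sketch below is illustrative only: it assumes cyclic groups as factors and represents elements of the product as tuples; direct_product and cyclic_group are hypothetical names.

```python
from itertools import product

GROUP_SIGNATURE = {"mul": 2, "inv": 1, "one": 0}

def cyclic_group(n):
    """Integers mod n under addition, as an algebra of group signature."""
    ops = {
        "mul": lambda a, b: (a + b) % n,
        "inv": lambda a: (-a) % n,
        "one": lambda: 0,
    }
    return set(range(n)), ops

def direct_product(factors, signature):
    """factors: list of (carrier, ops) pairs; elements of the product are tuples."""
    carrier = set(product(*(c for c, _ in factors)))

    def make_op(name):
        def op(*args):
            # Apply the i-th factor's operation to the i-th components.
            return tuple(
                ops[name](*(a[i] for a in args))
                for i, (_, ops) in enumerate(factors)
            )
        return op

    return carrier, {name: make_op(name) for name in signature}

C2, ops2 = cyclic_group(2)
C3, ops3 = cyclic_group(3)
P, pops = direct_product([(C2, ops2), (C3, ops3)], GROUP_SIGNATURE)

print(len(P))
print(pops["mul"]((1, 2), (1, 2)))
print(pops["one"]())

# The projection onto the first factor is compatible with "mul".
print(all(pops["mul"](a, b)[0] == ops2["mul"](a[0], b[0]) for a in P for b in P))
```

Taking all factors equal gives the direct power; a tuple indexed by I is then just a function from I to A listed by its values.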




Exercises 


1. Show that the set of all subalgebras of an Ω-algebra is a complete lattice (i.e. a
lattice in which every subset has a sup and an inf, see BA, Section 3.1).

2. Verify the equivalence of the two definitions of subalgebra generated by X, given
in the text, i.e. show that the set obtained from X by repeatedly applying Ω is the
least subalgebra containing X.

3. Show that if Ω is finite, then there are only finitely many Ω-algebras on a given
finite set as carrier. Is there a bound in terms of the size of the carrier alone?

4. Show that every homomorphism which is bijective is an isomorphism.

5. Let A be an Ω-algebra with a carrier of n elements. Show that A has at most n!
automorphisms and at most n^n endomorphisms. Find bounds on the number
of automorphisms and endomorphisms if Ω includes a constant operator. Find
bounds if A has an r-element generating set.

6. Let A be an Ω-algebra. Show that the set Map(A) of all mappings of A into itself
may be regarded as an Ω-algebra. Further show that End(A), the set of all
endomorphisms, is a subalgebra of Map(A) provided the following condition is
satisfied by A: Given θ ∈ Ω(m), ω ∈ Ω(n) and any m × n matrix over A, the element
obtained by applying θ to each column and ω to the result is the same as the
element obtained by applying ω to each row and θ to the result.


1.2 Congruences and the isomorphism theorems 


Let A and B be any sets. By a correspondence from A to B we understand a subset of
the Cartesian product A × B. For example, a mapping f : A → B may be defined as a
correspondence Γ_f from A to B which has the properties of being (i) everywhere
defined and (ii) single-valued:


(i) for each a ∈ A there exists b ∈ B such that (a, b) ∈ Γ_f,
(ii) if (a, b), (a, b') ∈ Γ_f, then b = b'.


This correspondence is sometimes called the graph of the mapping f.
We shall define two operations on correspondences. For any Γ ⊆ A × B we have
the inverse, defined as


Γ^{-1} = {(b, a) ∈ B × A | (a, b) ∈ Γ};


next, if Γ ⊆ A × B and Δ ⊆ B × C, then their composition is given by

Γ ∘ Δ = {(a, c) ∈ A × C | (a, x) ∈ Γ and (x, c) ∈ Δ for some x ∈ B}.


Further, if Γ ⊆ A × B and A' ⊆ A, we define A'Γ = {b ∈ B | (a, b) ∈ Γ for some a ∈ A'}.

On every set we have the identity correspondence, also called the diagonal,
1_A = {(a, a) | a ∈ A} and the universal correspondence A^2 = {(x, y) | x, y ∈ A}. For
example, the above conditions (i), (ii) on Γ_f to be the graph of a mapping can be
expressed as follows:


Γ_f ∘ Γ_f^{-1} ⊇ 1_A,   Γ_f^{-1} ∘ Γ_f ⊆ 1_B.
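
On finite sets a correspondence is simply a set of pairs, and the operations above become one-line set comprehensions. The following Python sketch is an illustration under that assumption; the function names inverse, compose, diagonal and is_mapping are hypothetical, and compose follows the convention "first Γ, then Δ".

```python
def inverse(gamma):
    """The inverse correspondence: all pairs reversed."""
    return {(b, a) for (a, b) in gamma}

def compose(gamma, delta):
    """First gamma, then delta: pairs (a, c) linked through some middle element."""
    return {(a, c) for (a, x) in gamma for (y, c) in delta if x == y}

def diagonal(A):
    """The identity correspondence 1_A."""
    return {(a, a) for a in A}

def is_mapping(gamma, A, B):
    """Conditions (i), (ii) expressed through composition and inverse."""
    everywhere_defined = diagonal(A) <= compose(gamma, inverse(gamma))
    single_valued = compose(inverse(gamma), gamma) <= diagonal(B)
    return everywhere_defined and single_valued

A = {1, 2, 3}
B = {"x", "y"}
graph = {(1, "x"), (2, "x"), (3, "y")}

print(is_mapping(graph, A, B))                 # a genuine mapping
print(is_mapping(graph | {(1, "y")}, A, B))    # 1 now has two images
print(is_mapping({(1, "x")}, A, B))            # 2 and 3 have no image
```

The two failing cases each violate exactly one of the conditions: adding the pair (1, "y") breaks single-valuedness, while the one-pair correspondence is not everywhere defined.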




To give another example, an equivalence on A may be defined as a subset Γ of A^2
with the properties


(i) Γ ∘ Γ ⊆ Γ (transitivity)
(ii) Γ^{-1} = Γ (symmetry)
(iii) Γ ⊇ 1_A (reflexivity).


To use correspondences in the study of Ω-algebras, we shall need to know their
behaviour as subalgebras.


Lemma 1.2.1. Let A, B, C be Ω-algebras and Γ, Δ subalgebras of A × B, B × C
respectively. Then Γ^{-1} is a subalgebra of B × A, Γ ∘ Δ is a subalgebra of A × C and
for any subalgebra A' of A, A'Γ is a subalgebra of B.


Proof. Take ω ∈ Ω(n) and (a_i, c_i) ∈ Γ ∘ Δ (i = 1, ..., n), say (a_i, b_i) ∈ Γ, (b_i, c_i) ∈ Δ,
and put a_1 ... a_n ω = a, b_1 ... b_n ω = b, c_1 ... c_n ω = c. Then since Γ, Δ are subalgebras,
we have (a, b) ∈ Γ, (b, c) ∈ Δ, hence (a, c) ∈ Γ ∘ Δ and since this holds for all ω ∈ Ω,
it shows Γ ∘ Δ to be a subalgebra. The proof for Γ^{-1} and A'Γ is quite similar and may
be left to the reader. ■


Let S, T be any sets and f : S → T a mapping between them. Then the image of f is
defined as SΓ_f, also written im f or Sf; the kernel of f is defined as the correspondence


ker f = {(x, y) ∈ S^2 | xf = yf}. (1.2.1)

In terms of the graph Γ_f of f we have

ker f = Γ_f ∘ Γ_f^{-1}.


Clearly it is an equivalence on S; the different equivalence classes are just the inverse 
images of elements in the image, sometimes called the fibres of f. 
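
The two descriptions of the kernel, and the statement that its classes are the fibres, can be checked on a small example. This Python sketch assumes a finite set S and an ordinary Python function f; the names graph, kernel, is_equivalence and fibres are hypothetical.

```python
def inverse(gamma):
    return {(b, a) for (a, b) in gamma}

def compose(gamma, delta):
    """First gamma, then delta."""
    return {(a, c) for (a, x) in gamma for (y, c) in delta if x == y}

def graph(f, S):
    return {(x, f(x)) for x in S}

def kernel(f, S):
    """ker f computed as the graph composed with its inverse."""
    g = graph(f, S)
    return compose(g, inverse(g))

def is_equivalence(rel, S):
    transitive = compose(rel, rel) <= rel
    symmetric = inverse(rel) == rel
    reflexive = {(x, x) for x in S} <= rel
    return transitive and symmetric and reflexive

def fibres(f, S):
    """Group the elements of S by their image under f."""
    out = {}
    for x in S:
        out.setdefault(f(x), set()).add(x)
    return out

S = range(6)
f = lambda x: x % 3   # assumed example mapping

ker = kernel(f, S)
print(ker == {(x, y) for x in S for y in S if f(x) == f(y)})
print(is_equivalence(ker, S))
print(fibres(f, S))
```

Here the kernel has three classes, one for each element of the image, in agreement with the description of the classes as fibres.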

Let us consider how the above definition is related to the kernel of a homo- 
morphism of groups. Given a group homomorphism f : G — H, the kernel of f 
in the usual sense is the inverse image under f of the unit element of H; this is a 
normal subgroup N of G and the different cosets of N in G are just the fibres of f. 
So ker f as defined in (1.2.1) is the set of cosets of N in G. Since this collection is 
entirely determined by N, it makes sense to replace it by N, which is what is usually 
done in group theory. But for arbitrary sets we shall need the whole correspondence 
ker f as defined above and we cannot replace it by anything simpler. This is still true 
when we come to study kernels of homomorphisms of Ω-algebras.

Let S, T be any sets and Γ a correspondence from S to T. We shall use Γ to define
systems of subsets of S, T between which there is an inclusion-reversing bijection, as
follows.

For any subset X of S we define a subset X* of T by


X* = {y ∈ T | (x, y) ∈ Γ for all x ∈ X},

and similarly, for any subset Y of T we define a subset Y* of S by

Y* = {x ∈ S | (x, y) ∈ Γ for all y ∈ Y}.




We thus have mappings

X ↦ X*,   Y ↦ Y* (1.2.2)


of 𝒫(S), 𝒫(T) into each other with the properties


X_1 ⊆ X_2 ⇒ X_1* ⊇ X_2*,   Y_1 ⊆ Y_2 ⇒ Y_1* ⊇ Y_2*, (1.2.3)
X ⊆ X**,   Y ⊆ Y**, (1.2.4)
X*** = X*,   Y*** = Y*. (1.2.5)


Conditions (1.2.3) and (1.2.4) are immediate from the definitions. If (1.2.3) is
applied to (1.2.4), we get X* ⊇ X*** and (1.2.4) applied with X* in place of X
gives X* ⊆ X***. Hence X*** = X* and similarly for Y*. This proves (1.2.5) as a
consequence of (1.2.3) and (1.2.4) alone.

A pair of mappings (1.2.2) between 𝒫(S) and 𝒫(T) satisfying (1.2.3), (1.2.4) and
hence (1.2.5) is called a Galois connexion. An obvious example, which also accounts
for the name, is the situation in field theory. If F is a field and G the group of all its
automorphisms, then the pairs (x, α) ∈ F × G such that x^α = x form a
correspondence which establishes a Galois connexion between certain subfields of F and
subgroups of G. If G is a finite group of automorphisms of F and k is the subfield
of elements left fixed by G, then there is a correspondence between all subgroups
of G and all fields between k and F (see BA, Section 7.6 and Section 11.8).
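
Properties (1.2.3)-(1.2.5) can be verified by brute force for a small correspondence. The sketch below is only an illustration; it assumes the divisibility relation on {1, ..., 6} as the correspondence Γ, and the names upper, lower and subsets are hypothetical.

```python
from itertools import combinations

def upper(X, gamma, T):
    """X* : elements of T related to every element of X."""
    return {y for y in T if all((x, y) in gamma for x in X)}

def lower(Y, gamma, S):
    """Y* : elements of S related to every element of Y."""
    return {x for x in S if all((x, y) in gamma for y in Y)}

def subsets(universe):
    items = list(universe)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield set(combo)

S = T = range(1, 7)
# Assumed example: (x, y) is in gamma when x divides y.
gamma = {(x, y) for x in S for y in T if y % x == 0}

X = {2, 3}
print(upper(X, gamma, T))                      # common multiples of 2 and 3
print(lower(upper(X, gamma, T), gamma, S))     # X**, which contains X

# (1.2.4) and (1.2.5) hold for every subset X of S.
print(all(X <= lower(upper(X, gamma, T), gamma, S) for X in subsets(S)))
print(all(
    upper(lower(upper(X, gamma, T), gamma, S), gamma, T) == upper(X, gamma, T)
    for X in subsets(S)
))
```

In this example X** is the set of divisors of the common multiples of X, so passing from X to X** acts as a closure operation on subsets of S.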

Let us define a congruence on an Ω-algebra A as an equivalence on A which is also
a subalgebra of A^2. For example, 1_A and A^2 are congruences on A, and every other
congruence q lies between these two: 1_A ⊆ q ⊆ A^2. The congruence on Z (as ring)
determined by a given positive integer m consists of the residue classes mod m,
i.e. the sets of numbers leaving a given remainder after division by m. As in this
example, we shall sometimes, for any congruence q on A, write a ≡ b (mod q) to
mean (a, b) ∈ q.

The next two results explain the significance of congruences for algebras. 


Theorem 1.2.2. Let f : A → B be a homomorphism of Ω-algebras. Then im f is a
subalgebra of B and ker f is a congruence on A.


Proof. It is easily checked that the graph Γ_f of f is a subalgebra of A × B. By Lemma
1.2.1, im f = AΓ_f is a subalgebra of B and ker f = Γ_f ∘ Γ_f^{-1} is a subalgebra of A^2,
therefore it is a congruence. ■


Given a group G with a normal subgroup N, we can put a group structure on the
set G/N such that the natural mapping G → G/N is a homomorphism. In the same
way we can, for any congruence q on an Ω-algebra A, define an algebra structure on
the set of q-classes A/q such that the natural mapping A → A/q is a homomorphism
with kernel q. This is the content of




Theorem 1.2.3. Let A be an Q-algebra and q a congruence on A. Then there exists a 
unique Q2-algebra structure on the set A/q of all q-classes such that the natural mapping 
v:A— A/q is a homomorphism. 


Proof. The natural mapping ν : A → A/q is well-defined because q is an equivalence.
It induces the mapping ν_n : A^n → (A/q)^n for n = 0, 1, ... in an obvious fashion. Let
us write [x] for the residue class (mod q) of x ∈ A. To complete the proof we must
show that for each ω ∈ Ω(n) there is just one way to complete the diagram


                 ν_n
        A^n ----------> (A/q)^n
         |                 |
       ω |                 | ω'
         v                 v
         A  ---------->   A/q
                  ν


to a commutative square; thus we have to find a map ω' : (A/q)^n → A/q such that

\[ [a_1] \ldots [a_n]\,\omega' = [a_1 \ldots a_n \omega]. \tag{1.2.6} \]

This equation defines ω' uniquely if we can show that the right-hand side is
independent of the choice of a_i in its q-class. Let (a_i, a_i') ∈ q; since q is a
subalgebra, we have (a_1 ... a_n ω, a_1' ... a_n' ω) ∈ q, i.e.

\[ [a_1 \ldots a_n \omega] = [a_1' \ldots a_n' \omega], \]

and this is what we had to show. ∎
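The well-definedness argument in this proof can be watched on a small example. The sketch below is illustrative only: `quotient_op` is a hypothetical helper, and Z/6 under addition is an assumed test case. It builds the table of the induced operation on classes following (1.2.6), and raises an error as soon as two choices of representatives give different classes.

```python
from itertools import product

def quotient_op(f, n, elems, rel):
    """Table of the operation induced by the n-ary f on the classes of rel,
    following (1.2.6). Raises ValueError if the result depends on the
    representatives, i.e. if rel is not compatible with f."""
    cls = {a: frozenset(b for b in elems if (a, b) in rel) for a in elems}
    table = {}
    for args in product(elems, repeat=n):
        key = tuple(cls[a] for a in args)
        value = cls[f(*args)]
        if table.setdefault(key, value) != value:
            raise ValueError("result depends on the representatives chosen")
    return table

A = list(range(6))
add = lambda x, y: (x + y) % 6
mod3 = {(a, b) for a in A for b in A if (a - b) % 3 == 0}
blocks = {(a, b) for a in A for b in A if a // 2 == b // 2}

# Classes of mod3, indexed by a representative.
c = {a: frozenset(b for b in A if (a, b) in mod3) for a in A}
table = quotient_op(add, 2, A, mod3)
print(table[(c[1], c[2])] == c[0])   # True: [1] + [2] = [0]

try:
    quotient_op(add, 2, A, blocks)   # an equivalence that is not a congruence
except ValueError as err:
    print("not a congruence:", err)
```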


The algebra so defined on A/q is again denoted by A/q and is called the quotient
algebra of A by q, with the natural homomorphism ν : A → A/q. For example, as
we have seen, A always has the congruences 1_A, A^2; the corresponding quotients
are A and the trivial Ω-algebra consisting of a single element. An Ω-algebra is said
to be simple if it is non-trivial and has no quotients other than itself and the trivial
algebra. It follows that an algebra A is simple iff it is non-trivial and has no
congruences other than 1_A or A^2.

The isomorphism theorems for groups have precise analogues for Ω-algebras. We
begin with the factor theorem, which is also familiar in the case of groups (BA,
Theorem 2.3.1).


Theorem 1.2.4 (Factor theorem). Let f : A → B be a homomorphism of Ω-algebras
and q a congruence on A such that q ⊆ ker f. Then there is a unique homomorphism
f' : A/q → B such that f = νf', where ν is the natural homomorphism from A to
A/q. Further, f' is injective if and only if q = ker f.


Proof. Let us again write [x] for the q-class of x ∈ A. If a homomorphism f' with the
stated properties exists, then it must satisfy

\[ [a]f' = af, \quad a \in A. \tag{1.2.7} \]

Thus there can be at most one such mapping. To show that there is one we have to
verify that the right-hand side of (1.2.7) depends only on the q-class containing a and
not on a itself. Let (a, a') ∈ q; then (a, a') ∈ ker f, hence af = a'f as claimed. Thus
there is a unique well-defined mapping f' to satisfy (1.2.7) and it only remains to
show that f' is a homomorphism. Given a_i ∈ A, ω ∈ Ω(n), we have (a_1 ... a_n ω)f =
(a_1 f) ... (a_n f)ω, hence by (1.2.7),

\[ [a_1 \ldots a_n \omega]f' = [a_1]f' \ldots [a_n]f'\,\omega, \]

and this shows f' to be a homomorphism. It is injective iff no two distinct q-classes
are identified by f' and this is just the condition q = ker f. ∎


Theorem 1.2.5 (First isomorphism theorem). Let f : A → B be a homomorphism of
Ω-algebras. Then

\[ A/\ker f \cong \operatorname{im} f. \tag{1.2.8} \]

Thus f may be factorized as f = νf_1μ, where ν : A → A/ker f is the natural
homomorphism, f_1 is the isomorphism (1.2.8) and μ : im f → B is the inclusion mapping.


Proof. By applying the factor theorem with q = ker f we find f' : A/ker f → B such
that f = νf', where f' is injective. Its image is im f, so there is an isomorphism
f_1 : A/ker f → im f to satisfy f = νf_1μ, as claimed. ∎
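To see the factorization of the first isomorphism theorem in a small case, one may take the additive group Z/12 and the homomorphism x ↦ 3x. This is a sketch under assumptions: the map and the helper `first_isomorphism` are illustrative choices, not from the text. The code groups elements into the classes of ker f and records the induced map from classes to im f.

```python
def first_isomorphism(f, elems):
    """Return f_1 as a dict sending each class of ker f to the common
    image of its members under f."""
    f1 = {}
    for a in elems:
        cls = frozenset(b for b in elems if f(a) == f(b))
        f1[cls] = f(a)          # well defined: f is constant on each class
    return f1

A = range(12)
f = lambda x: (3 * x) % 12      # an endomorphism of the additive group Z/12

f1 = first_isomorphism(f, A)
for cls, value in sorted(f1.items(), key=lambda kv: kv[1]):
    print(sorted(cls), "->", value)
```

Distinct classes receive distinct values, so the induced map is a bijection from A/ker f onto im f.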


Theorem 1.2.6 (Second isomorphism theorem). Let A be an Ω-algebra, A_1 a
subalgebra of A and q a congruence on A. Then the union of all q-classes meeting A_1 is
a subalgebra A_1^q of A, q_1 = q ∩ A_1^2 is a congruence on A_1 and we have an isomorphism

\[ A_1/q_1 \cong A_1^{q}/q. \tag{1.2.9} \]


Proof. Let ν : A → A/q be the natural homomorphism and ν_1 its restriction to A_1.
Then ν_1 is a homomorphism of A_1 into A/q; its image is the set of q-classes meeting
A_1, namely A_1^q/q, and its kernel is q ∩ A_1^2 = q_1. Applying Theorem 1.2.5 we obtain
(1.2.9). ∎


Similarly, by applying the factor theorem with B = A/r and the natural
homomorphism ν_r : A → A/r for f, we obtain


Theorem 1.2.7 (Third isomorphism theorem). Let A be an Ω-algebra and q, r
congruences on A such that q ⊆ r. Then there is a unique homomorphism θ : A/q → A/r
such that ν_q θ = ν_r. Further, ker θ is the set of pairs of q-classes that are identified in A/r.
Denoting this set by r/q, we find that r/q is a congruence on A/q and θ induces an
isomorphism

\[ \theta' : (A/q)/(r/q) \to A/r, \tag{1.2.10} \]

such that θ = ν_{r/q} θ', where ν_{r/q} is the natural homomorphism from A/q to
(A/q)/(r/q). ∎


If we fix q and vary r, we obtain


Corollary 1.2.8. Let A be an Ω-algebra and q a congruence on A. There is a natural
bijection between the set of congruences on A/q and the set of congruences on A which
contain q, and if r/q, r correspond in this way, then

\[ A/r \cong (A/q)/(r/q). \]

∎


In particular, we see that A/q is simple iff q is a maximal proper congruence on A. 
We note the standard application of Zorn’s lemma to obtain maximal subalgebras: 


Theorem 1.2.9. Let A be an Ω-algebra, A' a subalgebra and S a subset of A. Then there
exists a subalgebra C of A which is maximal subject to the conditions C ⊇ A',
C ∩ S = A' ∩ S.


Proof. The family of all subalgebras C such that C ⊇ A', C ∩ S = A' ∩ S is easily seen
to be inductive; hence by Zorn’s lemma there is a subalgebra which is maximal
subject to these conditions. ∎


Since congruences on A are certainly subalgebras of A^2 and the collection of all
congruences is inductive, we obtain


Corollary 1.2.10. Let A be an Ω-algebra, Γ a correspondence on A and q a congruence
on A. Then there exists a congruence q* on A which is maximal subject to the conditions
q* ⊇ q, q* ∩ Γ = q ∩ Γ. ∎


We conclude this section with a construction which is often used, the subdirect
product. Let us again take a direct product of Ω-algebras: P = ∏A_i, with projections
π_i : P → A_i. It is easily seen that P may be characterized by the properties:


(i) for any x, y ∈ P, if xπ_i = yπ_i for all i, then x = y,
(ii) given any family (a_i), where a_i ∈ A_i, there exists x ∈ P such that xπ_i = a_i.


Often one encounters situations where only (i) holds. This means that we are
dealing essentially with a certain subalgebra of the direct product ∏A_i, with projections
π_i : P → A_i. An algebra A is called a subdirect product of the A_i if there is an
embedding of A in the direct product P such that the image is mapped by π_i onto A_i, for
all i. We remark that any subalgebra A of ∏A_i is a subdirect product of the family A_i',
where A_i' is the image of the restriction map π_i|A. Subdirect products usually arise as
follows.


Proposition 1.2.11. Let A be an Ω-algebra and (q_i) a family of congruences on A. Put
q = ∩q_i, A_i = A/q_i. Then A/q is a subdirect product of the family (A_i).

Proof. The map θ : A → ∏A_i defined by

\[ a\theta = (a_i), \quad \text{where } a_i \text{ is the } q_i\text{-class of } a, \tag{1.2.11} \]

is a homomorphism and its kernel is clearly ∩q_i = q. Dividing by q, we obtain an
embedding A/q → ∏A_i and by (1.2.11), θπ_i is surjective, hence A/q is a subdirect
product of the A_i. ∎
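A small numerical illustration of the proposition, offered as a sketch with assumed data: on the ring Z/12 take the congruences mod 4 and mod 6. Their intersection is the diagonal, so the map of (1.2.11) embeds Z/12 in Z/4 × Z/6 with both projections surjective, although the image is not the whole direct product.

```python
A = range(12)

# The map (1.2.11): send a to its classes mod 4 and mod 6,
# each class being named by its least non-negative representative.
theta = {a: (a % 4, a % 6) for a in A}
image = set(theta.values())

# The two congruences and their intersection.
q1 = {(a, b) for a in A for b in A if (a - b) % 4 == 0}
q2 = {(a, b) for a in A for b in A if (a - b) % 6 == 0}
diagonal = {(a, a) for a in A}

print(q1 & q2 == diagonal)                   # True: the family is separating
print(len(image) == 12)                      # True: theta is injective
print({u for u, _ in image} == set(range(4)),
      {v for _, v in image} == set(range(6)))  # True True: projections are onto
print((1, 0) in image)                       # False: the image is a proper subalgebra
```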




Of course if the congruences q_i intersect in 1_A, then A itself is a subdirect product
of the A_i.

This proposition may also be expressed as follows. A congruence q on A is said to
separate a, b ∈ A if (a, b) ∉ q. Given a class 𝒳 of Ω-algebras, an algebra A is said to
be residually-𝒳 if for each pair of distinct elements a, b of A there is a congruence q
separating a and b such that A/q ∈ 𝒳. Now Proposition 1.2.11 tells us that an algebra
is a subdirect product of 𝒳-algebras iff it is residually-𝒳.

A family (q_i) of congruences on A is said to be separating if ∩q_i = 1_A. This just
means that any pair of distinct elements is separated by some q_i. It is clear that
any family which includes 1_A is separating. If the set of all congruences ≠ 1_A on A
is separating, A is called subdirectly reducible; otherwise A is subdirectly irreducible.
We note that the trivial algebra is subdirectly reducible according to this definition.
Let us also remark that A is subdirectly irreducible iff in every subdirect product
representation of A at least one of the homomorphisms to the factors is an
isomorphism. With this definition we have the following theorem, due to Garrett
Birkhoff, 1944.


Theorem 1.2.12. Every Ω-algebra A is a subdirect product of subdirectly irreducible
Ω-algebras which are homomorphic images of A.


Proof. Since the trivial algebra may be written as the empty product, we may take A
to be non-trivial. If a, b ∈ A, a ≠ b, then by Corollary 1.2.10, there is a maximal
congruence q_{ab} not containing (a, b). Thus (a, b) ∉ q_{ab} but (a, b) ∈ q' for all q' ⊃ q_{ab};
hence A/q_{ab} is subdirectly irreducible. Moreover, if (q_{ab}) is the family of congruences
formed for all such pairs a, b, then ∩q_{ab} = 1_A because any pair a ≠ b is separated
by some q_{ab}. Thus by Proposition 1.2.11, A is a subdirect product of subdirectly
irreducible algebras A/q_{ab}, each a homomorphic image of A. ∎


Exercises 


1. Write down the conditions for a correspondence from sets S to T to be a bijection. 

2. Describe a partial ordering on a set S in terms of the correspondence
{(a, b) ∈ S^2 | a ≤ b}.

3. Fill in the details in the proof of Lemma 1.2.1. 

4. Verify that the kernel of a ring homomorphism (in the sense defined in the text) is 
the equivalence whose classes are the cosets of an ideal. Consider the isomorphism 
theorems of the text in the case of rings. 

5. Verify that sets (without structure) can be regarded as the special case of
Ω-algebras with Ω = ∅. Interpret the factor theorem and the isomorphism theorems
for sets in this way.

6. Show that Z as a ring is a subdirect product of the fields F_p, where p runs over all
primes. Do the same for F_q where q = p^n for a fixed prime and n = 1, 2, ....




1.3 Free algebras and varieties 


In order to study Ω-algebras we shall need to form expressions in indeterminates,
just as polynomials in one or more indeterminates are used to study rings. Let
X = {x_1, x_2, ...} be any set, our alphabet, usually taken to be countably infinite,
and Ω any operator domain. We define an Ω-algebra W(Ω; X), the algebra of all
Ω-rows in X, as follows: An Ω-row in X is a finite sequence of elements in the set
Ω ∪ X (where X is assumed disjoint from Ω). The action of Ω is by juxtaposition;
thus if ω ∈ Ω(n) and u_1, ..., u_n ∈ W(Ω; X), then the effect of ω on the n-tuple
(u_1, ..., u_n) is the row

\[ u_1 u_2 \ldots u_n \omega. \]

Clearly X is a subset of W(Ω; X); the subalgebra generated by X is called the Ω-word
algebra on X and is denoted by W_Ω(X). Its elements are Ω-words in the alphabet X.
For example, if there is one binary operation α, then x_1 x_2 x_3 α x_4 α α is an Ω-word,
while x_1 α α x_2 α x_3 is an Ω-row which is not an Ω-word.

We shall need a simple test for finding which Ω-rows are words. For this purpose
we associate two integers with each Ω-row. The length of w ∈ W(Ω; X), written |w|,
is the number of terms in w; thus if w = c_1 ... c_N, where c_i ∈ Ω ∪ X, then |w| = N.
Secondly we define the valency of w as v(w) = Σ v(c_i), where

\[ v(c_i) = \begin{cases} 1 & \text{if } c_i \in X, \\ 1 - n & \text{if } c_i \in \Omega(n). \end{cases} \]

Intuitively the valency represents the element-balance: thus if ω ∈ Ω(n), then ω
requires an input of n elements and has an output of one element, so that
v(ω) = output - input. This idea is exploited in the following result (due to Karl
Schröter and independently, Philip Hall), which provides a criterion for an Ω-row
to be a word, using the notion of a prefix, i.e. a left-hand factor:


Proposition 1.3.1. An Ω-row w = c_1 ... c_N in X is an Ω-word if and only if every
prefix w_i = c_1 ... c_i of w satisfies

\[ v(w_i) > 0 \quad (i = 1, \ldots, N), \tag{1.3.1} \]

and

\[ v(w) = 1. \tag{1.3.2} \]

Moreover, each word can be obtained in just one way from its constituents.


Proof. We shall show more generally, by induction on the length |w|, that w is a
sequence of r words if (1.3.1) holds and

\[ v(w) = r. \tag{1.3.3} \]

This includes the assertion of the proposition for r = 1. When |w| = 1, (1.3.1) implies
that v(w) = 1, so w ∈ X ∪ Ω(0), and conversely, so the result holds in this case; we
may therefore take |w| > 1.




Suppose first that w is an Ω-word, say w = u_1 ... u_n ω, where u_i ∈ W_Ω(X) and
ω ∈ Ω(n). By the induction hypothesis, v(u_i) = 1 and v(ω) = 1 - n, so
v(w) = n + 1 - n = 1. Moreover, every prefix of each u_i has positive valency,
hence the same is true of w. When w is a sequence of r words, (1.3.1) again holds
and (1.3.3) follows by addition.

Conversely, let w be an Ω-row satisfying (1.3.1) and |w| > 1, v(w) = r > 0. We
write w = w'c, where c ∈ Ω ∪ X and v(w') = r' > 0 by (1.3.1). By induction on
the length, w' is then a sequence of r' Ω-words. Now either c ∈ X ∪ Ω(0), and
then w is a sequence of r' + 1 words and v(w) = r' + 1; or c ∈ Ω(n), where
n > 0, and then, since v(w) = r > 0, we have r' + 1 - n = r > 0, hence
n = r' + 1 - r ≤ r' and c is applied to the last n words of w' to produce a single word,
so w is a sequence of r' - (n - 1) = r Ω-words, as we had to show. This analysis
of w also shows that it is built up from its constituents in just one way. ∎
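The criterion of Proposition 1.3.1 is easy to run by machine. The following is a minimal sketch: rows are encoded as lists of symbol names, and the dictionary `arity`, the symbol name `alpha` and both function names are assumptions made for this illustration. It computes the valencies of all prefixes and then applies (1.3.1) and (1.3.2).

```python
def valency_trace(row, arity):
    """Valencies v(w_1), ..., v(w_N) of the prefixes of a row.
    arity maps each operator symbol to its arity n; every other
    symbol is treated as a letter of the alphabet X."""
    total, trace = 0, []
    for c in row:
        total += (1 - arity[c]) if c in arity else 1
        trace.append(total)
    return trace

def is_word(row, arity):
    """Test (1.3.1) and (1.3.2): all prefix valencies positive, total 1."""
    trace = valency_trace(row, arity)
    return bool(trace) and all(v > 0 for v in trace) and trace[-1] == 1

arity = {"alpha": 2}                       # one binary operation
good = "x1 x2 x3 alpha x4 alpha alpha".split()
bad = "x1 alpha alpha x2 alpha x3".split()

print(valency_trace(good, arity), is_word(good, arity))  # [1, 2, 3, 2, 3, 2, 1] True
print(valency_trace(bad, arity), is_word(bad, arity))    # [1, 0, -1, 0, -1, 0] False
```

The two rows are the examples given earlier in this section; the second fails already at its second prefix.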


The uniqueness statement in Proposition 1.3.1 means that it is never necessary
to insert brackets, because each expression is defined unambiguously. To give an
example, let + be a binary operation. Then the associative law may be written

\[ x_1 x_2 x_3 + + = x_1 x_2 + x_3 +. \]

If × is a second binary operation, then the familiar distributive laws take the form

\[ x_1 x_2 + x_3 \times = x_1 x_3 \times x_2 x_3 \times +, \qquad x_1 x_2 x_3 + \times = x_1 x_2 \times x_1 x_3 \times +. \]

It is essential to write the operation symbols on one side of the variables, say on the
right, as has been done here. Equivalently the operation symbols can all be written on
the left (the Łukasiewicz prefix notation). But with the usual infix notation x_1 + x_2
an ambiguity arises as soon as we form x_1 + x_2 + x_3.

Let A be an Ω-algebra. If in an element w of W = W_Ω(X) we replace each element
of X by an element of A we obtain a unique element of A. For |w| = 1 this is clear, so
assume that |w| > 1 and use induction. We have w = u_1 ... u_n ω (u_i ∈ W, ω ∈ Ω(n)),
where the u_i are uniquely determined once w is given, by Proposition 1.3.1. By
induction each u_i becomes some a_i ∈ A when we replace the elements of X by
elements of A; hence w becomes a_1 ... a_n ω, another element of A. This remark
can be used to establish the universal property of the Ω-word algebra.


Theorem 1.3.2. Let A be an Ω-algebra and X a set. Then every mapping θ : X → A
extends in just one way to a homomorphism θ* : W_Ω(X) → A.


Proof. Every Ω-word is of the form w = c_1 ... c_N, where c_i ∈ Ω ∪ X. We write
wθ* = c_1' ... c_N', where

\[ c_i' = \begin{cases} c_i & \text{if } c_i \in \Omega, \\ c_i\theta & \text{if } c_i \in X. \end{cases} \]

Thus wθ* is just the unique element of A obtained by replacing each x ∈ X by xθ.
The remark preceding the theorem shows that θ* is well-defined, and it is easily
seen to be a homomorphism extending θ, which is unique by Proposition 1.1.1. ∎
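The extension of a mapping on X to the whole word algebra amounts to evaluating a bracket-free word from left to right. The sketch below is one possible way to do this with a stack; the encoding of words as lists, the dictionary `ops` and the function name `evaluate` are assumptions for illustration, and the integers with + and × serve only as a sample algebra.

```python
import operator

def evaluate(word, theta, ops):
    """Image of a word under the homomorphism extending theta.
    word : list of symbols, operators written after their arguments
    theta: dict sending each letter of X to an element of A
    ops  : dict sending each operator symbol to (arity, function on A)"""
    stack = []
    for c in word:
        if c in ops:
            n, f = ops[c]
            if len(stack) < n:
                raise ValueError("not a word: too few arguments")
            args = stack[len(stack) - n:]
            del stack[len(stack) - n:]
            stack.append(f(*args))
        else:
            stack.append(theta[c])
    if len(stack) != 1:
        raise ValueError("not a word: valency differs from 1")
    return stack[0]

ops = {"+": (2, operator.add), "*": (2, operator.mul)}
theta = {"x1": 2, "x2": 3, "x3": 4}

lhs = "x1 x2 + x3 *".split()           # (x1 + x2) * x3
rhs = "x1 x3 * x2 x3 * +".split()      # x1 * x3 + x2 * x3
print(evaluate(lhs, theta, ops), evaluate(rhs, theta, ops))  # 20 20
```

The number of entries on the stack after each symbol is the valency of the corresponding prefix, which is why the same conditions as in Proposition 1.3.1 reappear as error checks.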




The content of this theorem is also expressed by saying that W_Ω(X) is the free
Ω-algebra on X as free generating set. Soon we shall meet free algebras in varieties
of algebras; the free groups encountered in BA Section 3.3, free modules of BA
Section 4.6 and the free associative algebras of BA Section 5.1 are examples.

Given any Ω-algebra A, we can take a generating set X of A and apply the
construction of Theorem 1.3.2. This yields


Corollary 1.3.3. Any Ω-algebra A can be expressed as a homomorphic image of an
Ω-word algebra W_Ω(X), for a suitable set X. Here X can be taken to be any set
corresponding to a generating set of A. ∎


The Ω-words may also be thought of as operations. Any word in x_1, ..., x_m ∈ X
(and elements of Ω) may be regarded as an m-ary operation, called a derived
operation. For example, in groups the commutator (x, y) = x^{-1}y^{-1}xy is a derived
operation. The derived operations include the original operations ω ∈ Ω, in the
form x_1 ... x_n ω, as well as m operations x_i (i = 1, ..., m). They are the projection
operators

\[ x_1 \ldots x_m \delta_i = x_i. \tag{1.3.4} \]

Moreover, we have a composition of operations: if f_1, ..., f_n are any words in
x_1, ..., x_m and g is a word in n variables, then f_1 ... f_n g is a word in x_1, ..., x_m,
obtained by composition from f_1, ..., f_n, g.


On any set A we can consider families of operations which include all projections
and are closed under composition. Such a family is called a clone of operations on A.
For example, if A is an Ω-algebra we have the clone generated by Ω; this is the
smallest clone including Ω and is obtained by repeatedly composing the elements
of Ω and the projections.

In studying Ω-algebras we are often not interested in the precise operations Ω, but
merely in the clone they generate, and they may be replaced by any other set of
operations generating the same clone. For example, groups may be defined in
terms of a constant operation e and the single binary operation xy^{-1}, or in terms
of e and the single ternary operation xy^{-1}z, or even as a non-empty algebra with
the single binary operation xy^{-1}, besides the usual ways. This raises the question
of finding relatively simple sets of operations.

Consider first unary operations. An operation ω : A^n → A will be called essentially
unary if it depends on at most one argument. More precisely, in terms of projections
this means that there is a unary operation f : A → A and i, 1 ≤ i ≤ n, such that
ω = δ_i f. For example, each projection operator is essentially unary. It is not hard
to verify that the clone generated by any set of essentially unary operations consists
entirely of essentially unary operations. For an Ω-algebra this case arises when
Ω = Ω(0) ∪ Ω(1). Such algebras can also be characterized by the fact that the
union of any two subalgebras is again a subalgebra. This shows for example that
groups cannot be defined by unary operations alone (since the union of two
subgroups is not usually a subgroup).




The distinction between binary and higher operations is much less precise, for as
the next result, due to Wacław Sierpiński [1945] shows, every finitary operation can
be composed from binary ones.


Theorem 1.3.4. Let A be a finite set. Then every finitary operation on A can be obtained 
by composition of binary operations on A. 


Proof. Suppose that |A| = n; we may without loss of generality regard A as the ring
of integers mod n, Z/n. The ring operations on Z/n are at most binary, so it will be
enough to show that every operation can be expressed in terms of the ring operations
and the δ-function

\[ \delta(x) = \begin{cases} 1 & \text{if } x = 0, \\ 0 & \text{if } x \neq 0. \end{cases} \tag{1.3.5} \]

This can be accomplished by a multivariable analogue of the Lagrange interpolation
formula: given f(x_1, ..., x_k), we have

\[ f(x_1, \ldots, x_k) = \sum f(a_1, \ldots, a_k)\,\delta(x_1 - a_1) \cdots \delta(x_k - a_k), \tag{1.3.6} \]

where the summation is over all k-tuples (a_1, ..., a_k). It is of course important to
realize that the a_i on the right of (1.3.6) are parameters, not variables; thus ax,
for any a ∈ Z/n, can be built up from x by repeated addition and so is (for any
given a) a unary operation. ∎
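Formula (1.3.6) can be checked directly for a small modulus. In the sketch below the function names `delta` and `interpolate`, the modulus n = 3 and the sample ternary operation are all assumptions chosen for illustration. It rebuilds an arbitrary operation on Z/3 from sums, products, translations and the δ-function, then compares the result with the original on all arguments.

```python
from itertools import product

def delta(x, n):
    """The delta-function (1.3.5) on Z/n."""
    return 1 if x % n == 0 else 0

def interpolate(f, k, n):
    """Return the operation on Z/n given by the right-hand side of (1.3.6)
    for a k-ary operation f; the tuples a are parameters, not variables."""
    def g(*xs):
        total = 0
        for a in product(range(n), repeat=k):
            term = f(*a)
            for x, ai in zip(xs, a):
                term = (term * delta(x - ai, n)) % n
            total = (total + term) % n
        return total
    return g

n = 3
f = lambda x, y, z: max(x, min(y, z))   # a ternary operation with no ring-theoretic origin
g = interpolate(f, 3, n)

print(all(f(*xs) == g(*xs) for xs in product(range(n), repeat=3)))  # True
```

For each argument tuple exactly one summand survives, namely the one whose parameters agree with the arguments, which is the whole content of the formula.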


The theorem still holds when A is infinite, but the proof in that case is quite
different and is based on the fact that there is then a bijection from A^2 to A,
which can be used to reduce n-ary operations to binary ones (see Cohn (1981) and
Exercise 3).

When we come to define a concrete class of algebras such as groups, we do so by
specifying its operations: μ binary, ν unary and ε 0-ary. The axioms for groups in
terms of these operations take the form:

\[ \text{(associativity)} \qquad xyz\mu\mu = xy\mu z\mu, \tag{1.3.7} \]
\[ \text{(neutral)} \qquad x\varepsilon\mu = \varepsilon x\mu = x, \tag{1.3.8} \]
\[ \text{(inverse)} \qquad xx\nu\mu = x\nu x\mu = \varepsilon. \tag{1.3.9} \]

Actually these laws as stated are redundant: parts of (1.3.8) and (1.3.9) follow from
the rest. This point is well known and does not concern us here.

We see that the axioms take the form of equations holding identically for all values
of the variables. Generally, by an identity or law over Ω in X we understand a pair
(u, v) ∈ W^2, or sometimes the equation formed from the pair:

\[ u = v. \tag{1.3.10} \]

We shall say that the law (1.3.10) holds in the Ω-algebra A or that A satisfies (1.3.10)
if every homomorphism W → A maps u and v to the same element of A, in other
words, if u and v define the same derived operation on A.
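For a finite algebra, whether a law holds can be decided by comparing the two derived operations on all arguments. The following sketch is illustrative: the helper `satisfies`, the encoding of the symmetric group of degree 3 by tuples, and the Python names `mu`, `nu`, `eps` for the group operations are assumptions. It confirms the three group laws and shows that commutativity is not a law of this group.

```python
from itertools import permutations, product

def satisfies(elems, lhs, rhs, nvars):
    """True if lhs and rhs define the same nvars-ary derived operation on elems."""
    return all(lhs(*xs) == rhs(*xs) for xs in product(elems, repeat=nvars))

# The symmetric group of degree 3; a permutation p is stored as (p(0), p(1), p(2)).
S3 = list(permutations(range(3)))
eps = (0, 1, 2)

def mu(p, q):
    """Product of p and q, with mappings written on the right: first p, then q."""
    return tuple(q[p[i]] for i in range(3))

def nu(p):
    """Inverse permutation."""
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

print(satisfies(S3, lambda x, y, z: mu(x, mu(y, z)), lambda x, y, z: mu(mu(x, y), z), 3))  # True
print(satisfies(S3, lambda x: mu(x, eps), lambda x: x, 1))        # True
print(satisfies(S3, lambda x: mu(x, nu(x)), lambda x: eps, 1))    # True
print(satisfies(S3, lambda x, y: mu(x, y), lambda x, y: mu(y, x), 2))  # False
```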




The relation between laws and algebras establishes a Galois connexion between the
set of all sets of laws in the given alphabet X and the class of all sets of Ω-algebras.
Given any set Σ of laws, we can form 𝒱_Ω(Σ), the class of all Ω-algebras satisfying
all the laws in Σ. This class 𝒱_Ω(Σ) is called the variety generated by Σ. For
example, groups form the variety of (μ, ν, ε)-algebras generated by (1.3.7)-(1.3.9).
Likewise rings form a variety, but fields do not. In the other direction we can
from any set 𝒞 of Ω-algebras form the set q(𝒞) of all laws holding in all algebras
of 𝒞. Now our Galois connexion relates each variety of Ω-algebras to a
correspondence on W_Ω(X) of the form q(𝒞).

For any class 𝒞 of Ω-algebras its members will be called 𝒞-algebras. Our next task
will be to determine the precise form of the set q(𝒞). A subalgebra of A is called fully
invariant in A if it is mapped into itself by all endomorphisms of A; this definition
also extends to congruences on A, as subalgebras of A^2.


Theorem 1.3.5. Let W = W_Ω(X) be the Ω-word algebra on an infinite alphabet X.
Then the Galois connexion between Ω-algebras and laws establishes a natural bijection
between varieties of Ω-algebras and fully invariant congruences on W.


Proof. For any class 𝒞 of Ω-algebras let 𝒞* = q(𝒞) be the set of all laws holding
in all 𝒞-algebras, and for any set Σ of laws let Σ* = 𝒱_Ω(Σ) be the variety defined
by Σ. We first show that 𝒞* is a fully invariant congruence on W. The congruence
properties are clear: in every 𝒞-algebra we have u = u for any u ∈ W; if u = v holds,
then so does v = u, and if u = v, v = w hold, then u = w holds too. Further, if
u_i = v_i (i = 1, ..., n) are laws holding in A ∈ 𝒞 and ω ∈ Ω(n), then u_1 ... u_n ω =
v_1 ... v_n ω holds in A. Now let (u, v) ∈ 𝒞* and let θ be any endomorphism of W. If
α : W → A, where A ∈ 𝒞, is any homomorphism, then so is θα, whence uθα = vθα.
Thus the law uθ = vθ holds in A, so (uθ, vθ) ∈ 𝒞* and this shows 𝒞* to be a fully
invariant congruence.
To complete the proof we have to show that

\[ \mathcal{V}^{**} = \mathcal{V} \tag{1.3.11} \]

for any variety 𝒱 and

\[ q^{**} = q \tag{1.3.12} \]

for any fully invariant congruence q on W.

By the definition of a variety, 𝒱 = Σ* for some Σ ⊆ W^2, hence
𝒱** = Σ*** = Σ* = 𝒱, which proves (1.3.11). To establish (1.3.12), we take a
fully invariant congruence q on W and first show that

\[ W/q \in q^{*}. \tag{1.3.13} \]

This will follow if we can show that all the laws corresponding to the elements of q
hold in W/q. Let (u, v) ∈ q, let α : W → W/q be any homomorphism and denote
the natural homomorphism W → W/q by ν. We shall define an endomorphism
α' of W such that

\[ w\alpha'\nu = w\alpha \quad \text{for all } w \in W; \tag{1.3.14} \]

to do so we pick for each x ∈ X an element x_0 ∈ W such that x_0ν = xα and define
xα' = x_0. By Theorem 1.3.2 the mapping α' : X → W so defined extends to a
homomorphism and (1.3.14) holds for all w ∈ X; hence by Proposition 1.1.1 it holds
generally. Now q is fully invariant, hence (uα', vα') ∈ q and so uα = uα'ν =
vα'ν = vα, and this establishes (1.3.13).

To prove (1.3.12), we note that in any case q** ⊇ q. If (u, v) ∉ q, then u = v is not
a law in W/q, but W/q ∈ q* by (1.3.13), and so (u, v) ∉ q**. Therefore equality holds
in (1.3.12). ∎


Let 𝒞 be a class of Ω-algebras. By a free 𝒞-algebra on a set X we understand an
algebra F in 𝒞 with the following universal property: there is a mapping μ : X → F
such that every mapping f from X into a 𝒞-algebra A can be factored uniquely by μ
to give a homomorphism from F to A, i.e. there exists a unique homomorphism
f' : F → A such that

\[ \mu f' = f. \tag{1.3.15} \]


Remarks 


1. If 𝒞 contains non-trivial algebras, then μ is an embedding. For, given a, b ∈ X,
a ≠ b, we can map X to a 𝒞-algebra by a mapping f such that af ≠ bf; hence
by (1.3.15), aμ ≠ bμ.

2. If 𝒞 admits subalgebras, then the free 𝒞-algebra F is generated by the image Xμ.
For otherwise we could replace F by the subalgebra generated by Xμ; since F is
unique up to isomorphism, it must itself be generated by Xμ. Thus Xμ generates
F; it is called a free generating set.

3. If 𝒞 admits subalgebras, F is a free 𝒞-algebra on X and X' is a subset of X, then
the subalgebra of F generated by X' is the free 𝒞-algebra on X'. For this
subalgebra is easily seen to possess the universal property.


Not every class has free algebras, but they exist in varieties, by our next result. 


Proposition 1.3.6. Let 𝒱 be any variety of Ω-algebras and q the congruence on
W = W_Ω(X) consisting of all the laws holding in 𝒱. Then W/q is the free 𝒱-algebra
on X.


Proof. By (1.3.13), W/q is a 𝒱-algebra, so it only remains to verify the universal
property. Let us write ν : W → W/q for the natural mapping. Given any mapping
f : X → A to a 𝒱-algebra, by Theorem 1.3.2 this extends to a homomorphism
f̄ : W → A. Given u, v ∈ W, if u ≡ v (mod q), then (u, v) is a law in 𝒱 and so
holds in A, hence uf̄ = vf̄. Thus q ⊆ ker f̄, and by the factor theorem there is a
homomorphism f' : W/q → A such that f̄ = νf'. If μ : X → W is the injection,
we have f = μf̄ = μνf', and f' is unique, since it is given on a generating set of
W/q. Thus W/q satisfies all the conditions for a free 𝒱-algebra. ∎




There is another way of forming free algebras, which leads to a useful criterion, 
due to Garrett Birkhoff, for a class of algebras to be a variety. 


Theorem 1.3.7. Let 𝒞 be a class of Ω-algebras; 𝒞 is a variety if and only if it is closed
under the operations of taking subalgebras, homomorphic images and direct products.


Proof. The necessity of the conditions is easy to check; given any Ω-algebra A, it is
clear that any subalgebra and any homomorphic image of A satisfy all the laws
holding in A. Moreover, if a law holds in every member of a family of Ω-algebras, then it
also holds in their direct product. This shows that every variety satisfies the given
conditions.

Conversely, let 𝒞 be a class of Ω-algebras closed under subalgebras, homomorphic
images and direct products. Then 𝒞 contains the trivial algebra (as the direct product
of the empty family). If there are no other algebras in 𝒞, then we have the variety
defined by the law x_1 = x_2. So we may now assume that 𝒞 contains a non-trivial
algebra. We can form a free 𝒞-algebra on a given set X as follows. Consider the
set of all 𝒞-algebras with a generating set of cardinal not exceeding that of X.
Take all mappings f_λ : X → A_λ, where A_λ is a 𝒞-algebra and Xf_λ a generating set
of A_λ, and in the direct product P = ∏A_λ consider the subalgebra F generated by
the elements (xf_λ), x ∈ X. As a subalgebra of the direct product, F is again in 𝒞.
We claim that F satisfies the universal property relative to the mapping
μ : x ↦ (xf_λ). For if f : X → A is any mapping to a 𝒞-algebra A and A' is the
subalgebra generated by Xf, then the restriction f|A' coincides with some f_λ, and so A' is
a homomorphic image of F, the mapping F → A' being the projection on the
appropriate factor. Hence we have a homomorphism f' : F → A such that f = μf' and f'
is unique since it is prescribed on a generating set of F. Thus F is the free 𝒞-algebra
on X.

Clearly we have

\[ \mathcal{C} \subseteq \mathcal{C}^{**}, \tag{1.3.16} \]

and it remains to prove equality. Let q = 𝒞* be the set of all laws holding in 𝒞. By
Proposition 1.3.6, the free 𝒞-algebra is W/q. If A ∈ 𝒞**, we can write A as a
homomorphic image of W, for an appropriate X, say f : W → A. By the definition of
𝒞** = q*, A satisfies all the laws of q, hence f can be factored by q; thus A is a
homomorphic image of W/q, the free 𝒞-algebra on X, and A is therefore itself a 𝒞-algebra.
Hence equality in (1.3.16) is established. ∎


We have already remarked that rings and groups are examples of varieties. We
now see that fields (commutative or not) do not form a variety, since they do not
admit direct products; for if E, F are any fields, their direct product E × F as a
ring has zero-divisors and so cannot be a field.
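The smallest instance of this remark can be listed explicitly. As a sketch with assumed data (the fields with 2 and 3 elements, elements of the product stored as pairs, and the helper name `times`), the code below finds all pairs of non-zero elements of the product ring whose product is zero.

```python
from itertools import product

# Elements of the direct product of the fields with 2 and 3 elements.
P = list(product(range(2), range(3)))
zero = (0, 0)

def times(u, v):
    """Componentwise multiplication in the product ring."""
    return ((u[0] * v[0]) % 2, (u[1] * v[1]) % 3)

zero_divisor_pairs = [(u, v) for u in P for v in P
                      if u != zero and v != zero and times(u, v) == zero]
print(zero_divisor_pairs)
```

Every pair found has one factor of the form (1, 0) and the other of the form (0, b) with b ≠ 0, in one order or the other.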




Exercises 


1. Show that if some operation symbols are written on the left and others on the
right of their arguments, then ambiguities can arise.

2. Let w = c_1 ... c_N be an Ω-word of the form u_1 ... u_n ω (u_i ∈ W, ω ∈ Ω(n)).
Show that any proper subsequence c_i c_{i+1} ... c_j, where j - i < N - 1, which is
itself an Ω-word, can only occur within a single factor u_k.

3. Assuming a bijection μ : A^2 → A between a set A and its Cartesian square A^2,
show that every n-ary operation ω on A can be expressed in terms of the
binary operation μ and a suitable unary operation.

4. Verify that the set of all essentially unary operations on a set is a clone. Deduce 
that any operation derived from essentially unary operations is again unary. 

5. Let A be any Ω-algebra, X a set and for each x ∈ X, let δ_x : A^X → A be the
projection on the x-th factor. Show that the subalgebra of the direct power
A^{A^X} generated by all the δ_x (x ∈ X) is the free algebra on X for the variety
generated by the algebra A (i.e. the least variety containing A).

6. Show that modular lattices form a variety. Similarly for distributive lattices, and 
Boolean algebras. 

7. Show that groups may be defined in terms of the operation xyα = xy^{-1} as
non-empty algebras satisfying xzαyzαα = xyα, xxαyyαyαα = y. Show that abelian
groups may be defined by xxyαα = y, xyαzα = xzαyα.

8. Show that any variety of groups defined by a finite set of laws can also be defined 
by a single law. 

9. Show that the automorphism group of W_Ω(X) is isomorphic to the group of all
permutations of X.

10. Let 𝒱 be a variety of Ω-algebras and F the free 𝒱-algebra on a set X. Given a
homomorphism f : A → B between 𝒱-algebras which is surjective, and a
homomorphism θ : F → B, find a homomorphism θ' : F → A such that
θ'f = θ. (Hint. See the proof of Theorem 1.3.5.)




1.4 The diamond lemma 


In many algebraic problems the elements of a set are defined as equivalence classes of
formal expressions, where two expressions are considered as equivalent if one can
pass from one to the other by a series of ‘moves’. The problem is to decide when
two expressions are equivalent. For example, the elements of a particular Ω-algebra
A are given by Ω-words in a generating set; the defining relations in A allow the
passage between certain words and we have to decide when two given words
represent the same element of A (the word problem for A).

The situation may be represented by a graph as follows: the vertices of our graph
are the different formal expressions, and each move, from u to v say, is represented
by an edge from u to v. Now the equivalence classes are the connected components of
our graph. Frequently the moves are of two sorts: direct moves (e.g. in a group,
removing a factor xx^{-1}) and their inverses (inserting a factor xx^{-1} in a certain
place); this means that we have a directed graph. An expression is reduced if it
admits no direct moves and the main result of this section, the diamond lemma, gives
conditions under which each equivalence class contains a single reduced expression.
The conditions are of a form that frequently applies, and the lemma leads to a simple
solution of our problem: To test if two expressions are equivalent we apply direct moves
until each is in reduced form; if these reduced forms are equal, then and only then are the
two expressions equivalent.


Lemma 1.4.1 (Diamond lemma, M. H. A. Newman [1942]). Let A be a set with an
equivalence relation defined on it by moves as above, such that the following conditions
are satisfied:


(i) Finiteness condition. For each u ∈ A there exists an integer r = r(u) such that no
chain of direct moves applied to u has more than r terms.

(ii) Confluence condition. If u ∈ A can be transformed to x by one direct move and to y
by another, then there exists v ∈ A which can be reached from each of x, y by an
appropriate series of direct moves.


Then each equivalence class of A contains exactly one reduced element.


Proof. By (i) we can from each element of A reach a reduced element by a finite
series of direct moves, so each equivalence class contains a reduced element. Given
u ∈ A, suppose that we reach a reduced element a in m moves, passing through
the elements u = a_0, a_1, ..., a_m = a and that u = b_0, b_1, ..., b_n = b is another
such chain leading to a reduced element b in n steps; we have to show that a = b.
By (ii) we can reach a common element c_0 by direct moves applied to a_1 or to b_1,
and by direct moves applied to c_0 we reach a reduced element c. We shall use
induction on the least value of r(u). Clearly r(a_1) < r(u), hence by induction we have
c = a, and similarly c = b, therefore a = b, as claimed. ∎


A typical application is the existence proof of a normal form for the elements of a 
free group (see Exercise 4 and Chapter 3). For a discussion of the applications to 
rings, with many illuminating examples, see Bergman [1978]. 
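The test for equivalence described above can be tried out on free-group words. The following is only a sketch under stated assumptions: a letter is modelled as a (name, exponent) pair, and the function names are chosen here for illustration and are not taken from the text. It follows every chain of direct moves from a word and confirms that all of them end in the same reduced word.

```python
def inverse(letter):
    """Formal inverse of a letter; a letter is a (name, exponent) pair, exponent +1 or -1."""
    name, exponent = letter
    return (name, -exponent)


def direct_moves(word):
    """All words obtained from `word` by one direct move (deleting an adjacent pair x x^-1)."""
    return [word[:i] + word[i + 2:]
            for i in range(len(word) - 1)
            if word[i + 1] == inverse(word[i])]


def reduced_forms(word):
    """Follow every chain of direct moves and collect the reduced words reached."""
    moves = direct_moves(word)
    if not moves:
        return {word}
    return set().union(*(reduced_forms(w) for w in moves))


def normal_form(word):
    """Apply direct moves (always the leftmost one) until the word is reduced."""
    word = tuple(word)
    while direct_moves(word):
        word = direct_moves(word)[0]
    return word


def equivalent(u, v):
    """Two words are equivalent iff their reduced forms coincide."""
    return normal_form(u) == normal_form(v)


x, y = ("x", 1), ("y", 1)
X, Y = inverse(x), inverse(y)
w = (x, y, Y, X, y, x, X)   # the word x y y^-1 x^-1 y x x^-1
```

Here the finiteness condition holds because each direct move shortens the word by two letters, so the recursion in `reduced_forms` terminates; that the set it returns has a single member is what the confluence condition guarantees.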


Exercises 


1. In Lemma 1.4.1(ii) assume that if u is transformed to x by one direct move and to
y by another, where x ≠ y, then there exists v ∈ A which can be reached from each
of x, y by just one direct move. Show that all reduction chains from a given
element to a reduced element have the same length. Show that the extra condition 
cannot be omitted. 

2. (M. H. A. Newman) Show that the conclusion of Lemma 1.4.1 still applies if (ii)
holds but instead of (i) we have merely the minimum condition: no element 
admits an infinite succession of direct moves. (Hint. Repeat the construction in 
the proof of Lemma 1.4.1 and use the minimum condition.) 

3. Let A be a ring with an endomorphism α. Show that in the ring R generated by A
and a symbol x satisfying ax = xa^α for all a ∈ A, every element can be uniquely
written as a polynomial in x: Σ x^i a_i (a_i ∈ A), and hence that A is embedded in R
(see Section 7.3 below).

4. Let X, X′ be disjoint sets with a bijection x ↔ x′ between them. Write
Z = X ∪ X′ and on the set Z* of all strings of letters from Z (including the
empty string 1) define a product by juxtaposition (this is just the free monoid
on Z, see Section 11.1 below). Define direct moves as the replacement of fxx′g
or fx′xg by fg and their inverses as inverse moves. Apply the diamond lemma
to deduce the existence of a normal form for the elements of a free group (see 
Section 3.5 below). 

5. Let S be a semigroup (system with an associative multiplication) without idempotent
(i.e. x² ≠ x for all x ∈ S) and satisfying


ua = ub ⇒ va = vb for all u, v, a, b ∈ S. (1.4.1)


By adjoining formal solutions of the equations

xa = b (a, b ∈ S), (1.4.2)


show that S can be embedded in a semigroup in which (1.4.2) has a solution for
all a, b. (Hint. Adjoin a new symbol p to S and consider all words in S ∪ {p} with
direct move pa → b. Verify the conditions of Lemma 1.4.1 and show that distinct
elements of S cannot be equivalent. Now show that the resulting semigroup again 
satisfies (1.4.1) and repeat the process (see Cohn [1956]).) 


1.5 Ultraproducts 


Let us again consider the direct product construction. Given a direct product 
P = ∏ A_i of Ω-algebras, we have seen that if a law u = v holds in each factor A_i,
then it holds in the product. On the other hand, consider the statement occurring
in the definition of a field:


for all a ≠ 0 there exists a′ such that aa′ = 1. (1.5.1)


This may well hold in each factor A_i and yet fail to hold in the direct product, as we
see by taking the direct product of two fields; the element (1, 0) is different from 0 
but does not have an inverse. 
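The failure of (1.5.1) in a direct product can be checked by brute force. The sketch below takes the two factors to be the prime fields Z/3 and Z/5; this choice, and the function names, are assumptions made only for illustration.

```python
from itertools import product


def has_inverse_mod(a, p):
    """True if a has a multiplicative inverse in Z/p."""
    return any((a * b) % p == 1 for b in range(p))


def field_axiom_holds_mod(p):
    """Check statement (1.5.1) in Z/p: every non-zero element has an inverse."""
    return all(has_inverse_mod(a, p) for a in range(1, p))


def product_counterexamples(p, q):
    """Non-zero elements of Z/p x Z/q (componentwise operations) without an inverse."""
    bad = []
    for a, b in product(range(p), range(q)):
        if (a, b) == (0, 0):
            continue
        invertible = any((a * c) % p == 1 and (b * d) % q == 1
                         for c, d in product(range(p), range(q)))
        if not invertible:
            bad.append((a, b))
    return bad
```

The elements returned by `product_counterexamples(3, 5)` are exactly those with one component zero and the other non-zero, among them (1, 0).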

In order to remedy the situation we introduce certain homomorphic images of 
direct products, called ultraproducts, which have the property that every sentence 
of first-order logic which holds in all the factors, also holds in the product. For a 
complete proof we would need a detailed description of what constitutes a sentence 
in first-order logic, i.e. a sentence without free variables, and in which all quantifica- 
tions are over object variables (an ‘elementary sentence’), and this would take us 
rather far afield. However, the construction itself is easily explained and has many 
uses in algebra. We describe it below and refer for further details to Bell and Slomson 
(1971), Barwise (1977) and Cohn (1981). 

We shall need the concept of an ultrafilter. Let I be a non-empty set. By a filter on
I one understands a collection ℱ of subsets of I such that


F.1 I ∈ ℱ, ∅ ∉ ℱ,
F.2 if X, Y ∈ ℱ, then X ∩ Y ∈ ℱ,
F.3 if X ∈ ℱ and X ⊆ X′ ⊆ I, then X′ ∈ ℱ.


For example, given a subset A of I, if A ≠ ∅, then the set of all subsets of I containing
A is a filter, called the principal filter generated by A. More generally, if (A_λ) is any
family of subsets of I, then the collection of all subsets containing a set of the form


A_{λ_1} ∩ ... ∩ A_{λ_n} (1.5.2)


forms a filter, provided that none of the sets (1.5.2) is empty. This condition on the
A_λ is called the finite intersection property. Thus any family of subsets of I with the
finite intersection property is contained in a filter on I. Such a family is also called
a filter base.

An ultrafilter on I is a filter which is maximal among all the filters on I. An alternative
characterization is given by


Lemma 1.5.1. A filter ℱ on I is an ultrafilter if and only if for each subset A of I, either
A or its complement A′ belongs to ℱ.

Of course A, A′ cannot both belong to ℱ, because then ℱ would contain
∅ = A ∩ A′.


Proof. Let ℱ be an ultrafilter. If A ∉ ℱ, then by F.3, no member of ℱ can be contained
in A and so each member of ℱ meets A′. It follows that the family of all sets
containing some F ∩ A′ (F ∈ ℱ) is a filter containing ℱ, but then it must equal ℱ by
maximality of the latter, so A′ ∈ ℱ. Conversely, if for each A ⊆ I, either A or A′
belongs to ℱ, consider a filter ℱ_1 ⊃ ℱ and B ∈ ℱ_1 \ ℱ. By assumption, B′ ∈ ℱ
and so ∅ = B ∩ B′ ∈ ℱ_1, which is a contradiction. ∎


The existence of ultrafilters is clear from Zorn’s lemma: 


Theorem 1.5.2. Every filter on a set I is contained in an ultrafilter. 


Proof. Let ℱ be a filter on I and consider the set of all filters containing ℱ. This set
is easily seen to be inductive, hence it has a maximal member, which is the required
ultrafilter. ∎


For example, the principal filter generated by a one-element subset is an ultrafilter. 
When I is finite, every ultrafilter is of this form, but for infinite sets we can always 
find non-principal ultrafilters as follows. Let I be an infinite set and call a subset
cofinite if it has a finite complement in I. The collection of all cofinite subsets of I
clearly has the finite intersection property and so is contained in an ultrafilter, 
and the latter is easily seen to be non-principal. 
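For a small finite index set the statements about filters can be verified exhaustively. The following sketch enumerates every family of subsets of a three-element set, keeps those satisfying F.1-F.3, and compares maximality with the criterion of Lemma 1.5.1. The representation by frozensets and all function names are assumptions of this illustration.

```python
from itertools import combinations


def subsets(universe):
    """All subsets of `universe`, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_filter(family, universe):
    """Check F.1-F.3 for a family of subsets of `universe`."""
    universe = frozenset(universe)
    family = set(family)
    if universe not in family or frozenset() in family:          # F.1
        return False
    if any(x & y not in family for x in family for y in family):  # F.2
        return False
    return all(y in family                                        # F.3
               for x in family for y in subsets(universe) if x <= y)


def all_filters(universe):
    """Every filter on `universe`, found by testing all families of subsets."""
    power = subsets(universe)
    return [frozenset(fam)
            for r in range(len(power) + 1)
            for fam in combinations(power, r)
            if is_filter(fam, universe)]


def ultrafilters(universe):
    """The filters that are maximal among all filters on `universe`."""
    filters = all_filters(universe)
    return [f for f in filters if not any(f < g for g in filters)]


def satisfies_lemma(filt, universe):
    """Criterion of Lemma 1.5.1: exactly one of A, A' lies in the filter, for every A."""
    universe = frozenset(universe)
    return all((a in filt) != ((universe - a) in filt) for a in subsets(universe))


def principal_filter(generator, universe):
    """The principal filter generated by a non-empty subset."""
    g = frozenset(generator)
    return frozenset(s for s in subsets(universe) if g <= s)
```

On the set {1, 2, 3} this search finds seven filters, all principal, and the maximal ones are those generated by the one-element subsets, in agreement with the remark on finite sets. The non-principal case cannot be exhibited in this way, since it needs an infinite index set and Zorn's lemma.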

We shall use filters to construct certain homomorphic images of direct products. 
Let A_i (i ∈ I) be a family of Ω-algebras and let ℱ be any filter on the index set I.
Then the reduced product


∏ A_i/ℱ (1.5.3)


is the homomorphic image of the direct product P = ∏ A_i, defined by the rule:

for any x, y ∈ P, x ≡ y ⇔ {i ∈ I | xπ_i = yπ_i} ∈ ℱ, (1.5.4)


where π_i is the projection on A_i. Let us call a subset of I ℱ-large if it belongs to ℱ.
Then the definition states that x ≡ y iff x and y agree on an ℱ-large set. We have to
verify that we obtain an Ω-algebra in this way, i.e. that the correspondence defined on P
by (1.5.4) is a congruence. Reflexivity and symmetry are clear and transitivity follows
by F.2. Now take ω ∈ Ω(n) and let x_ν ≡ y_ν (ν = 1, ..., n), say x_ν and y_ν agree on
A_ν ∈ ℱ. Then A_1 ∩ ... ∩ A_n ∈ ℱ and on this set x_1...x_nω and y_1...y_nω agree.
Thus we have


Theorem 1.5.3. A reduced product of Ω-algebras is an Ω-algebra. ∎


More generally this holds for any 𝒱-algebra, where 𝒱 is a variety, because the
reduced product is a homomorphic image of a direct product. 
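Rule (1.5.4) is easy to experiment with when the index set is finite. In the sketch below the index set has three elements, each factor is the additive group Z/2, and the filter is the principal filter generated by {0, 1}; all of these choices and the function names are assumptions made for illustration.

```python
from itertools import product

INDEX_SET = (0, 1, 2)
GENERATOR = frozenset({0, 1})   # the filter consists of all subsets containing {0, 1}


def is_large(subset):
    """Membership in the principal filter generated by GENERATOR."""
    return GENERATOR <= frozenset(subset)


def congruent(x, y):
    """Rule (1.5.4): x and y are congruent iff they agree on a large set of indices."""
    return is_large(i for i in INDEX_SET if x[i] == y[i])


def add(x, y):
    """Componentwise addition in the direct product of three copies of Z/2."""
    return tuple((x[i] + y[i]) % 2 for i in INDEX_SET)


P = list(product(range(2), repeat=len(INDEX_SET)))   # the direct product


def classes():
    """The congruence classes, i.e. the elements of the reduced product."""
    return {frozenset(y for y in P if congruent(x, y)) for x in P}
```

With these choices the eight elements of the direct product fall into four classes, since the third component is ignored, and the relation is compatible with addition, as the argument before Theorem 1.5.3 predicts.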

A reduced product formed with an ultrafilter is called an ultraproduct, or ultrapower
if all factors are the same. Now the ultraproduct theorem for Ω-algebras
asserts that an ultraproduct ∏ A_i/ℱ of Ω-algebras formed with an ultrafilter ℱ is
again an Ω-algebra; moreover, any elementary sentence holds in the ultraproduct
precisely if it holds in each factor A_i for i in an ℱ-large set. As already indicated,
we shall not prove the full form, but merely a special case, which illustrates the 
method and which is sufficient for our purposes. 


Theorem 1.5.4. Any ultraproduct of skew fields is a skew field. 


Proof. Let D_i (i ∈ I) be a family of skew fields and K = ∏ D_i/ℱ their ultraproduct,
formed with an ultrafilter ℱ on I. By Theorem 1.5.3 and the remark following it, K is
a ring. Let a ∈ K and suppose that a ≠ 0. Taking a representative (a_i) for a in ∏ D_i,
we have


J = {i ∈ I | a_i = 0} ∉ ℱ, (1.5.5)


because a ≠ 0. Therefore its complement J′ belongs to ℱ by Lemma 1.5.1 and we
can define b_i by


b_i = a_i^{-1} if i ∈ J′,
b_i = 0 if i ∈ J.


Let us denote the image of (b_i) in K by b. Since a_i b_i = b_i a_i = 1 for i ∈ J′ and J′ ∈ ℱ,
we find that ab = ba = 1. Hence every non-zero element of K has an inverse and so
K is a skew field, as claimed. ∎


It is instructive to take a reduced product and see where the proof fails; it was for 
(1.5.5) that we needed the property of ultrafilters singled out in Lemma 1.5.1. 

To illustrate Theorem 1.5.4 and the ultraproduct theorem mentioned earlier, let 
us take a sentence Ψ in the language of fields and suppose that we can find fields
of arbitrarily large characteristic for which Ψ holds. Then Ψ also holds in their ultraproduct,
and this will be of characteristic 0, if it was formed with a non-principal
ultrafilter. For suppose that k_i is a field of characteristic p_i, where p_1 < p_2 < ... and
p_i → ∞ as i → ∞. Then the sentence


Φ_n : 1 + 1 + ... + 1 = 0 (n terms)


holds in only finitely many of the k_i for each n, and hence its negation ¬Φ_n holds in
their ultraproduct; this shows the latter to be of characteristic 0. Since Ψ holds in
each k_i, it also holds in the ultraproduct. Thus we have


Proposition 1.5.5. An elementary sentence which holds in fields k_i of finite characteristic
p_i, where the p_i are unbounded, also holds in certain fields of characteristic 0. ∎


For example, consider the statement: every non-degenerate binary quadratic form 
is universal. This may be stated as 


∀a, b, c ∃x, y (a ≠ 0 ∧ b ≠ 0 ⇒ ax² + by² = c).


It holds for all finite fields of characteristic not two (see BA, Section 8.2); hence it 
also holds in certain fields of characteristic 0. 
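For the prime fields Z/p the statement about binary quadratic forms can be confirmed by direct search. The sketch below is limited to a few small odd primes and says nothing about other finite fields; the function names are assumptions of this illustration.

```python
def is_universal(a, b, p):
    """True if ax^2 + by^2 takes every value in Z/p."""
    values = {(a * x * x + b * y * y) % p for x in range(p) for y in range(p)}
    return len(values) == p


def sentence_holds(p):
    """Check the sentence in Z/p: for all non-zero a, b the form ax^2 + by^2 is universal."""
    return all(is_universal(a, b, p)
               for a in range(1, p) for b in range(1, p))
```

The word 'certain' in Proposition 1.5.5 is essential: in the field of real numbers the equation x² + y² = −1 has no solution, so the sentence fails in some fields of characteristic 0.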

As a second illustration we observe that a field of characteristic p may be defined 
by the sentence 'Φ ∧ Φ_p'. Hence a field of finite characteristic is defined by the
'infinite disjunction'


Φ ∧ [Φ_2 ∨ Φ_3 ∨ Φ_5 ∨ ...].


This is not an elementary sentence as it stands. But we can assert further that it is not 
equivalent to any set of elementary sentences. For if it were, it would hold in all fields 
of finite characteristic and hence also in some fields of characteristic 0, which is 
clearly not the case. 


Exercises 


1. Show that any ultrafilter which includes a finite set must be principal. 

2. Let A be an infinite set. Show that for every non-empty subset B of A there is an
ultrafilter ℱ_B including B. What is the condition on B for ℱ_B to include all cofinite
subsets?

3. Let I be a set and A(I) the Boolean algebra of all subsets of I (see BA, Section 3.4).
Defining ideals of Boolean algebras as inverse images of 0 in homomorphisms,
show that a filter on I is just the complement of a non-zero ideal in A(I).
Which ideals correspond to ultrafilters? 

4. Show that any formula Φ(x) holds in an ultraproduct ∏ A_i/ℱ iff it holds in
all the factors A_i for an ℱ-large set of indices. (Hint. Verify that the formulae
for which this is true include all atomic formulae and are closed under
∨, ∧, ¬, ∀, ∃. Hence the result holds for sentences, i.e. formulae without free
variables.)

5. (Compactness theorem of model theory) Let 𝒮 be a set of elementary sentences
about Ω-algebras. Show that if each finite subset P of 𝒮 has a model (i.e. there is
an algebra in which each sentence of P holds), then 𝒮 has a model. (Hint. For
each P ⊆ 𝒮 take a model A_P and form a suitable ultraproduct of the A_P.)

6. Show that an integral domain R which is embeddable in a direct product of skew
fields is embeddable in a skew field. (Hint. Let K_i (i ∈ I) be the family of skew
fields and for c ∈ R^× let I_c be the set of indices i for which c is inverted in K_i.
Verify that the I_c form a filter base.)


1.6 The natural numbers 


In a first approach to mathematics one usually takes the natural numbers for 
granted, but for a rigorous development it is necessary either to provide an axiomatic 
foundation for the natural numbers, or to deduce their properties from some other 
domain such as set theory. The latter alternative would of course make it necessary to 
include an axiomatic foundation of set theory and this would involve us far deeper in 
foundational questions than is appropriate here. Such a study would occupy a whole 
volume by itself and would not greatly help us in our understanding of algebra. 
We shall therefore confine ourselves to a derivation of the properties of the natural 
numbers from the Peano axioms and a brief discussion of their significance, as well 
as their relevance to algebra. As we shall see, the framework of universal algebra is 
particularly appropriate for this purpose. 

We begin by writing down a system of axioms for the natural numbers. This is not 
so much to give a rigorous foundation as to make explicit the properties of numbers 
we are using. The notions of set theory will be used freely, in the intuitive form intro- 
duced in BA. We shall also use individual numbers (e.g. to label the axioms) without 
hesitation; no axioms are needed for the numbers up to 12, or for that matter up to
10^12. The purpose of the axioms is to allow us to deal with the set of all numbers.

The axioms, as stated essentially by Giuseppe Peano in 1889, are: 


N.1 1 is a natural number. 

N.2 Every natural number n has a successor n′, which is again a natural number.
N.3 1 is not the successor of any number.

N.4 Distinct numbers have distinct successors: m ≠ n ⇒ m′ ≠ n′.

N.5 (Principle of induction) A set of numbers containing 1 and with each number its
successor contains all numbers.


The set of all natural numbers will be denoted by N. We can think of N as an algebra 
with a single unary operation, the successor function x ↦ x′. Let us call an algebra with
a single unary operation an induction algebra. By N.2, N is an induction algebra; 
moreover it is generated by the single element 1, by N.5. To elucidate the structure 
of N we begin with a general lemma on induction algebras. 


Lemma 1.6.1. Let A be an induction algebra. Then the subalgebra B generated by an 
element b of A consists of b and successors of elements of B. 


Proof. The set B_1 consisting of b and successors of elements of B is contained in B; it
contains b and the successor of any element of B, and so is a subalgebra. Since B is
the least subalgebra containing b, it follows that B_1 = B, as claimed. ∎


For example, if we take A = N, b = 1 and remember N.5, we see from the lemma 
that every number different from 1 is the successor of a number. Thus if n ≠ 1, then
there is a number which we shall denote by n − 1 such that (n − 1)′ = n. By N.4,
n − 1 is uniquely determined by n; it is called the predecessor of n.

For any n ∈ N we denote by [n] the subalgebra generated by n.


Lemma 1.6.2. For all n ∈ N, we have n ∉ [n′].


Proof. Let us first show that

n ≠ n′. (1.6.1)


For n = 1 this holds by N.3; if it holds for any n ≥ 1, then it holds for n′ by N.4,
hence it holds for all n, by induction (i.e. N.5).

Now by Lemma 1.6.1, [1′] consists entirely of successors of numbers, whereas 1 is
not a successor, hence 1 ∉ [1′]. Suppose now that n ∉ [n′] but that n′ ∈ [n″]. By
(1.6.1), n′ ≠ n″, so n′ must be the successor of an element in [n″]; but this can
only be n (by N.4), so n ∈ [n″] and [n″] ⊆ [n′], therefore n ∈ [n′], which contradicts
the hypothesis. By induction we conclude that n ∉ [n′] for all n. ∎


Let us write |n] for the complement of [n′] in N. By Lemma 1.6.2, n ∈ |n]; the
elements of |n] other than n will be called the antecedents of n. When n ≠ 1, they
clearly include the predecessor n − 1 of n. With these preparations we can prove a
result on which the box principle is based. 


Theorem 1.6.3. Let m, n ∈ N. There is an injective mapping from |m] to |n] if and only
if |m] ⊆ |n]. Further there is a bijection between |m] and |n] if and only if m = n.


Proof. If |m] ⊆ |n], then the inclusion mapping is the required injection, and for
m = n this is a bijection. Conversely, assume that f : |m] → |n] is an injective mapping;
we must show that |m] ⊆ |n]. When m = 1, then since 1 ∉ [n′], we have
1 ∈ |n] and so |1] ⊆ |n]. We may therefore assume that m ≠ 1 and use induction
on m. Since m ≠ 1, there is a predecessor m − 1; we define a mapping
g : |m − 1] → |n] by the rule


kg = kf if kf ≠ n,
kg = mf if kf = n.

To check that this is a well-defined mapping we note that there is at most one
number k such that kf = n, because f is injective. Denote this number by k_0; if
k_0 = m or k_0 is not defined (because the image of f does not include n), then g is
just f restricted to |m − 1]. Otherwise g differs from f only at k_0 and there it has
the value mf which it assumes nowhere else, for the domain of g does not include
m. Thus g is well-defined; moreover g is injective and it does not assume the value
n. It follows that n ≠ 1 and that g is an injective mapping from |m − 1] to
|n − 1]. By the induction hypothesis, |m − 1] ⊆ |n − 1]; taking successors on both
sides, we obtain |m] ⊆ |n].

If there is a bijection between |m] and |n], we conclude that |m] = |n] and therefore
m = n, because n is the unique member of |n] without a successor. ∎


A set S is called finite if there is a bijection between S and |n], for some n ∈ N.
By what has been said, there can be at most one such n and this is called the 
cardinal of S. Thus for any finite set there is a natural number which is its cardinal. 

Theorem 1.6.3 leads to the familiar ordering of the natural numbers: we write 
m ≤ n to mean that m ∈ |n] (i.e. m is n or an antecedent of n), or equivalently, |m] ⊆ |n]. It is clear
that this relation is reflexive and transitive, and by the last part of Theorem 1.6.3
we see that m ≤ n, n ≤ m implies m = n. Thus we have a partial ordering. From
the definition it is clear that m ≤ n implies m′ ≤ n′ and it is easy to show that
'≤' is a total ordering. Given m, n ∈ N, if m = 1, then clearly m ≤ n; similarly if
n = 1, then n ≤ m. Now if m, n ≠ 1, we can form m − 1, n − 1 and by induction
either m − 1 ≤ n − 1 or n − 1 ≤ m − 1. Taking successors we find that m ≤ n or
n ≤ m. We shall also adopt the usual notation of writing m < n or n > m to
mean 'm ≤ n but m ≠ n' and m ≥ n to mean n ≤ m.

In contrapositive form Theorem 1.6.3 shows that if |m] ⊄ |n], so that m > n, then
there can be no injective mapping from |m] to |n]. In particular, taking
m = n′ (= n + 1), we see that when n′ objects are distributed over n boxes, at
least one box contains more than one element. This is just Dirichlet’s Box Principle, 
already encountered in BA, p.2, where it was stated without formal proof. 
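For small m and n the content of Theorem 1.6.3 can be checked by listing all mappings. The sketch below models |n] as the set {1, ..., n} and a mapping as the tuple of its values; both conventions, and the function names, are assumptions of this illustration.

```python
from itertools import product


def segment(n):
    """The set |n] = {1, ..., n}, i.e. n together with its antecedents."""
    return range(1, n + 1)


def injective_maps(m, n):
    """All injective mappings |m] -> |n], each given as the tuple of its values."""
    return [f for f in product(segment(n), repeat=m) if len(set(f)) == m]


def bijective_maps(m, n):
    """Those injective mappings |m] -> |n] which are also surjective."""
    return [f for f in injective_maps(m, n) if set(f) == set(segment(n))]
```

An exhaustive search of this kind only confirms the box principle for the finitely many cases examined; the induction in the proof is what covers all m and n.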

The natural numbers have another property not shared by all ordered sets; they are 
well-ordered, i.e. every non-empty subset of N has a least element. Given ∅ ≠ S ⊆ N,
let T be the set of numbers m such that m ≤ n for all n ∈ S. Clearly 1 ∈ T; we claim
that there is a number a ∈ T such that a′ ∉ T. For if a′ ∈ T for all a ∈ T, then by
induction T = N and S must be empty, a contradiction. Hence there exists a ∈ T
such that a′ ∉ T and it follows that a is the least number in S, since a ≤ n for all
n ∈ S, and a ∈ S since otherwise a′ ∈ T. This proves


Theorem 1.6.4. The set N of natural numbers is well-ordered. ∎


Our next task is to establish a universal property of N, which forms the basis of the 
process of definition by recursion. 


Theorem 1.6.5. N is the free induction algebra on 1. Thus if A is any induction algebra
and a ∈ A, there is a unique homomorphism α : N → A such that 1α = a.


Proof. In detail the assertion states that A is a set with a single unary operation
x ↦ x′, and given a ∈ A, there is a unique mapping x ↦ xα from N to A such that


1α = a, x′α = (xα)′ for all x ∈ N. (1.6.2)


By Proposition 1.1.1 there can be at most one such mapping. To prove its existence
we form the direct product N × A; this is again an induction algebra, with the operation
(x, y)′ = (x′, y′). Let H be the subalgebra of N × A generated by (1, a); further
denote by p, q the projection mappings of N × A on the factors N and A respectively,
and by p_1, q_1 their restrictions to H.

The image of H under p_1 is a subalgebra of N containing 1, and hence is N itself,
by N.5. Thus for each x ∈ N,


there exists y ∈ A such that (x, y) ∈ H. (1.6.3)


We claim that for each x ∈ N there is exactly one such y. For by Lemma 1.6.1, H
consists of (1, a) together with successors of members of H. Any successor is of
the form (x′, y′); here the first component is different from 1, by N.3, so the y determined
in (1.6.3) by x = 1 is unique. Let N_1 be the subset of all x ∈ N occurring as the
first component of just one member of H, i.e. for which the y ∈ A obtained in (1.6.3)
is unique. As we have seen, 1 ∈ N_1; if we can show that N_1 contains with each
element its successor, it will follow from N.5 that N_1 = N.

Let x ∈ N_1 and suppose that y_1, y_2 ∈ A are such that (x′, y_i) ∈ H (i = 1, 2); we
have to show that y_1 = y_2. Since x′ ≠ 1, (x′, y_i) is a successor in H, say
(x′, y_i) = (u_i, v_i)′ = (u_i′, v_i′), where (u_i, v_i) ∈ H. Equating first components, we find
x′ = u_1′ = u_2′, and by N.4, x = u_1 = u_2, i.e. (x, v_i) ∈ H for i = 1, 2. But x ∈ N_1
and this means that v_1 = v_2; hence v_1′ = v_2′ and so y_1 = y_2, as we had to show.
Therefore x′ ∈ N_1 and by N.5 we conclude that N_1 = N.

We have now shown that for each x ∈ N there is a unique y ∈ A such that
(x, y) ∈ H. Writing xα for y, we find that 1α = a, x′α = (xα)′, so (1.6.2) holds
and the proof is complete. ∎


Functions on N are frequently defined recursively, for example the sum of the
squares of the first n natural numbers may be defined as the function φ : N → N
such that


φ(1) = 1, φ(n + 1) = φ(n) + (n + 1)².


It may appear intuitively obvious that this defines a function, but this needs to be 
proved. The basic reason is that N is the free induction algebra on 1, as we shall 
now see. The statement in Theorem 1.6.6 is a little more general, but the proof 
begins with the case exemplified above. 


Theorem 1.6.6. Given a ∈ N and any function f from N² to N, there exists a unique function
φ from N to itself, satisfying the equations:


φ(1) = a, φ(n′) = f(n, φ(n)). (1.6.4)


Proof. Suppose first that f is independent of its first argument. Then we have to find
φ : N → N to satisfy


φ(1) = a, φ(n′) = f(φ(n)). (1.6.5)


In this case the result follows immediately from Theorem 1.6.5, taking A there to be
the set N with f as its successor function.


In the general case we take A = N² with successor function

(x, y) ↦ (x′, f(x, y)),


and apply Theorem 1.6.5 with the element (1, a) in place of a. We obtain a mapping
φ : N → N²; if its projections on the factors are φ_1 and φ_2, then φ_1(1) = 1,
φ_1(n′) = φ_1(n)′, hence φ_1(x) = x for all x ∈ N, and now

φ_2(1) = a, φ_2(n′) = f(n, φ_2(n)).

Thus φ_2 is the required function. ∎
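The construction in this proof translates directly into a small program. In the sketch below the pair (x, y) is moved along by the successor function (x, y) ↦ (x′, f(x, y)) of N², starting from (1, a), until the first component reaches n. The function name `define_by_recursion` and the use of Python integers for N are assumptions of this illustration.

```python
def define_by_recursion(a, f):
    """Return the function phi with phi(1) = a and phi(n') = f(n, phi(n)),
    obtained by iterating (x, y) -> (x + 1, f(x, y)) on pairs, starting at (1, a)."""
    def phi(n):
        if n < 1:
            raise ValueError("phi is defined on the natural numbers 1, 2, 3, ... only")
        x, y = 1, a
        while x != n:
            x, y = x + 1, f(x, y)
        return y
    return phi


# The example from the text: phi(1) = 1, phi(n + 1) = phi(n) + (n + 1)^2.
sum_of_squares = define_by_recursion(1, lambda n, y: y + (n + 1) ** 2)

# A second instance: phi(1) = 1, phi(n + 1) = (n + 1) * phi(n), the factorial.
factorial = define_by_recursion(1, lambda n, y: (n + 1) * y)
```

For instance `sum_of_squares(3)` passes through the pairs (1, 1), (2, 5), (3, 14) and so returns 14.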


This result emphasizes the difference between ‘proof by induction’ and ‘definition 
by recursion’. Whereas the former embodied N.5, the latter also relies on N.3, N.4, 
and a further argument (such as that leading to Theorem 1.6.6) is needed to prove it.
From the algebraist’s point of view the situation can be summed up by saying that 
a proof by induction depends on the fact that N, as induction algebra, is generated 
by 1, whereas the method of definition by recursion depends on the fact that N is 
generated freely by 1. 

As an application let us see how Theorem 1.6.6 may be used to define addition and 
multiplication on N. Given a ∈ N, there is a mapping α_a : N → N such that


1α_a = a′, x′α_a = (xα_a)′.

If we write a + x in place of xα_a, these equations take on a more familiar form:

a + 1 = a′, a + x′ = (a + x)′. (1.6.6)


From this definition it is easy to prove the associative and commutative laws of 
addition: 


(a + b) + c = a + (b + c), (1.6.7)


a + b = b + a. (1.6.8)


We shall prove (1.6.7) as an example and leave (1.6.8) to the reader. For c = 1,
(1.6.7) reduces to (a + b) + 1 = a + (b + 1), i.e. (a + b)′ = a + b′, which is true
by the definition (1.6.6). If we assume that (1.6.7) holds for c = n, then
(a + b) + n′ = [(a + b) + n]′ = [a + (b + n)]′ = a + (b + n)′ = a + (b + n′), by
(1.6.6) and the case c = n. Hence (1.6.7) holds for c = n′, and so by induction it
holds for all c.

Similarly we can define multiplication by constructing for each a ∈ N, a mapping
μ_a : N → N such that 1μ_a = a, x′μ_a = xμ_a + a. The existence and uniqueness
follow again from Theorem 1.6.6, and we write as usual xμ_a = ax, so that the definition
takes the form


a1 = a, a(x + 1) = ax + a. (1.6.9)


The associative and commutative laws can again be proved without difficulty, as well 
as the distributive law. If we adjoin a new element, denoted by 0, to N, to satisfy 
a + 0 = a, a0 = 0, we have a monoid under addition and the usual procedure for
embedding a commutative cancellation monoid in a group enables us to embed N in
a group, which is the additive group of integers Z. All that is needed is a proof of the
cancellation law: a + c = b + c ⇒ a = b; this presents no difficulty and may be left
to the reader. Now the multiplication on N can easily be extended to the set Z of all 
integers and the distributive law verified; thus we have obtained a ring structure on 
Z. It is also not difficult to extend the ordering of N to Z; it is still preserved under 
addition but only under multiplication by a positive number (i.e. an element of N). 
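The recursive definitions (1.6.6) and (1.6.9) can be imitated in code by building everything from the successor operation alone. This is a sketch with assumed names; the loop counter relies on Python's built-in integers purely as a convenience for stepping from 1 up to x, while the values themselves are produced only by `succ`.

```python
def succ(n):
    """The successor operation n -> n' on N = {1, 2, 3, ...}."""
    return n + 1


def homomorphism(a, step):
    """The map alpha of Theorem 1.6.5 into the induction algebra with successor
    `step`: 1*alpha = a and x'*alpha = step(x*alpha)."""
    def alpha(x):
        value = a
        for _ in range(x - 1):      # x is reached from 1 by x - 1 successor steps
            value = step(value)
        return value
    return alpha


def add(a, x):
    """a + x as in (1.6.6): a + 1 = a', a + x' = (a + x)'."""
    return homomorphism(succ(a), succ)(x)


def mul(a, x):
    """ax as in (1.6.9): a1 = a, a(x + 1) = ax + a."""
    return homomorphism(a, lambda value: add(value, a))(x)
```

Checking the associative, commutative and distributive laws for a range of small arguments is then a finite computation, though of course only the inductive proofs establish them for all of N.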

Let us consider the Peano axioms from the point of view of first-order logic. We 
remark that N.1-N.4 are elementary sentences, whereas N.5 is not. Without going
into detail this can be explained by saying that when expressed formally, N.5 involves
quantification over sets of numbers and not merely numbers. It is easy to give
examples of structures other than N which satisfy N.1-N.4. For example, take the
set T consisting of the disjoint union of two induction algebras, T_1 isomorphic to
N and T_2 isomorphic to Z. Then T satisfies N.1-N.4, though of course not N.5.

This leaves open the question whether it may not be possible to replace N.5 by 
elementary sentences, so as to characterize the natural numbers by elementary 
sentences alone. This question can be answered negatively by forming an ultrapower 
I of N with a non-principal ultrafilter. By the ultraproduct theorem (see Theorem
1.5.3 and the remarks following it), I is again an induction algebra and satisfies all
the elementary sentences holding in N, but I is not isomorphic to N, because it is
again totally ordered, by the rule: x ≤ y iff x_n ≤ y_n for all components in a large
set, but unlike N it is not well-ordered. To see this, we consider the sequence
a_r = (a_{r,n}) of elements of I, where a_{1,n} = n and for r ≥ 1,


a_{r+1,n} = [½(a_{r,n} + 1)],


where [x] denotes the greatest integer ≤ x. Thus a_1 = (1, 2, 3, 4, ...),
a_2 = (1, 1, 2, 2, 3, 3, 4, 4, ...), a_3 = (1, 1, 1, 1, 2, 2, 2, 2, 3, ...), etc. This is an infinite
strictly descending sequence, because the set {n ∈ N | a_{r+1,n} ≥ a_{r,n}} is finite, for
r = 1, 2, ..., and it shows incidentally that being well-ordered is not an elementary
property.
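The descending sequence can be tabulated numerically. The sketch below assumes that the sequence is given by a_{1,n} = n and a_{r+1,n} = [½(a_{r,n} + 1)], with [x] the greatest integer not exceeding x; the function names are chosen here for illustration.

```python
def a(r, n):
    """The component a_{r,n}, with a_{1,n} = n and a_{r+1,n} = floor((a_{r,n} + 1) / 2)."""
    value = n
    for _ in range(r - 1):
        value = (value + 1) // 2
    return value


def exceptional_indices(r, bound):
    """The indices n <= bound at which a_{r+1,n} >= a_{r,n}, i.e. where no descent occurs."""
    return [n for n in range(1, bound + 1) if a(r + 1, n) >= a(r, n)]
```

Under this assumption the exceptional indices for a given r are exactly n = 1, ..., 2^(r−1), the places where a_{r,n} has already reached 1. This set is finite, so its complement is cofinite and therefore belongs to every non-principal ultrafilter on N, which gives a_{r+1} < a_r in the ultrapower.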

An induction algebra A satisfying the same elementary sentences as N is said to be 
elementarily equivalent to N, or also a model of N; if A is not isomorphic to N, it is 
called a non-standard model. Such non-standard models of N allow one to introduce 
‘infinite’ numbers, just as a non-standard model of the real numbers R may contain 
‘infinitesimal’ numbers. Non-standard analysis is a powerful tool with many applications,
but it lies outside the scope of this work (see e.g. Robinson (1963), Stroyan
and Luxemburg (1976), Barwise (1977)). 


Exercises 


1. Prove the commutative law of addition in N. 

2. Prove the associative and commutative laws of multiplication, the distributive law 
and the cancellation law in N. 

3. Give a direct proof that an induction algebra generated by a single element 1 satis- 
fies either N.3 or N.4. 


4. Show that in any induction algebra the union of two subalgebras is again a subalgebra.

5. Let f be the function on the non-negative integers, defined by f(0) = 0, f(x′) = x.
Describe the function g(x, y) defined by g(x, 0) = x, g(x, y′) = f(g(x, y)).

6. Define the natural ordering on Z in terms of the ordering on N and prove its
compatibility with addition and multiplication by positive numbers.

7. Show that there is no total ordering on Z/m, the set of integers mod m, preserving
addition and multiplication.

8. Use Theorem 1.6.3 to give a proof that N is not finite.

9. Give a direct proof by induction that there exists no surjective mapping from |m]
to |n] if m < n.


Further exercises for Chapter 1 


1. Let 𝒱 be a variety of Ω-algebras and assume that there is a 𝒱-algebra C whose
carrier is finite with more than one element. Let F_n be the free 𝒱-algebra on an
n-element generating set; by establishing a correspondence between C^n and
homomorphisms from F_n to C, show that any generating set of F_n has at least
n elements. Deduce that F_m and F_n are not isomorphic when m ≠ n.


2. Show that the free distributive lattice on three generators has 18 elements. (Hint.
Form the different expressions in x_1, x_2, x_3; try cases of one and two generators
first.)

3. Show that the free distributive lattice with 0, 1 on n free generators is 2°.

4. Define N as an algebra with the single unary operation λ by the rule


nλ = n − 1 if n > 1,
nλ = 1 if n = 1.


Show that any non-trivial homomorphic image of N is isomorphic to N, but that
N is not simple.


5. Let p_n be the number of equivalence relations on a set of n elements. Obtain the
following recursion formula for p_n:

p_{n+1} = \sum_{k=0}^{n} \binom{n}{k} p_k, p_0 = 1.

Show also that Σ p_n x^n/n! = exp[(exp x) − 1].


6. (G. M. Bergman) On an infinite-dimensional vector space V, define a filter of
subspaces as a set of subspaces of V satisfying F.1-F.3 (where F.1 now reads:
V ∈ ℱ, 0 ∉ ℱ), and define an ultrafilter again as a maximal filter. Show that
if ℱ is an ultrafilter, then every subspace of V either lies in ℱ or has a complement
in ℱ. Let 𝒜 be the set of linear maps V → V which are 'continuous',
i.e. the inverse image of an ℱ-space is an ℱ-space, and let 𝒩 be the set of maps
with kernel in ℱ. Show that 𝒩 is an ideal in 𝒜 and that 𝒜/𝒩 is a skew field.


7. A ring R is called prime if R ≠ 0 and aRb = 0 implies a = 0 or b = 0 (see
Chapter 8 below). Show that an ultraproduct of prime rings is prime, but an
ultraproduct of simple rings need not be simple. Is an ultrapower of a simple
ring necessarily simple? What about 𝔐_n(K), where K is a skew field?


8. Let k be a field and ℱ an ultrafilter on N. Show that there is a monoid homomorphism
∏ 𝔐_n(k)/ℱ → k^N/ℱ, whose kernel is an ideal.


9. For any set S denote by {S} the set whose single member is S (as usual). Let M be
the induction algebra generated by ∅ in this way, with successor operation
S′ = S ∪ {S}. Show that M satisfies N.1-N.5 with ∅ in place of 1.

10. Use Theorem 1.6.5 to define exponentiation on N by a^1 = a, a^{n′} = a.a^n. What
goes wrong if we try to define this operation on Z/m?


Homological algebra 


The present chapter serves as a concise introduction to homological algebra. Only 
the basic notions of category theory (treated in BA) are assumed. The definition 
of abelian categories (Section 2.1) and of functors between them (Section 2.2) is 
followed by an abstract description of module categories in Section 2.3. A study of 
resolutions leads to the notion of homological dimension in Section 2.4; derived 
functors are then defined in Section 2.5 and exemplified in Section 2.6 by the 
instances that are basic for rings, Ext and Tor. Universal derivations are used in 
Section 2.7 to prove a form of Hilbert’s syzygy theorem. 


2.1 Additive and abelian categories 


We have met general categories in BA, Section 3.3, but most of the instances have 
been categories of modules or at least categories with similar properties. For a general 
study we shall therefore postulate the requisite properties; this will lead to additive 
categories, and more particularly, abelian categories. Later, in Section 2.3, we shall 
see what further assumptions are needed to reach module categories. 

We recall that an object I in a category $\mathcal{A}$ is called initial if for each $\mathcal{A}$-object X
there is a unique morphism $I \to X$; dually, if there is a unique morphism $X \to I$ for
each object X, then I is called a final object. As we saw in BA, Section 3.3, an initial
(or final) object, when it exists, is unique up to a unique isomorphism. For example,
the category Rg of rings and homomorphisms has the trivial ring, consisting of 0
alone, as final object, and Z, the ring of integers, as initial object. Initial objects
arise in the solution of universal problems. Thus let $\mathcal{A}$ be a concrete category, i.e.
a category with a forgetful functor U to Ens, the category of sets and mappings,
which is faithful, and denote by UX the underlying set of an $\mathcal{A}$-object X. We fix
a set S and consider the category (S, U) whose objects are mappings $S \to UX$
($X \in \mathrm{Ob}\,\mathcal{A}$) and whose morphisms are commutative triangles arising from an
$\mathcal{A}$-map $f : X \to Y$ by applying U. This is the comma category based on S and U.
An initial object in this category is said to have the universal property for the set S.
For example, in the category of groups, the free group on S has the universal
property for S.


[Diagram: commutative triangle formed by the mappings $S \to UX$, $S \to UY$ and $Uf : UX \to UY$.]


Let $\mathcal{A}$ be a category and $(X_i)$ any family of $\mathcal{A}$-objects. Given $\mathcal{A}$-objects P, Y, each
family of maps $\pi_i : P \to X_i$ gives rise to a natural mapping

$$\mathcal{A}(Y, P) \to \prod_i \mathcal{A}(Y, X_i), \tag{2.1.1}$$

where $g \mapsto (g\pi_i)$ and $\prod$ is the usual Cartesian product. When (2.1.1) is bijective for
each Y we call P a product of the $X_i$ with natural projections $\pi_i$. The product P
with its maps $\pi_i$ can also be described as the solution of a universal problem, for
it is the final object in the category whose objects $(A, f_i)$ are families of maps
$f_i : A \to X_i$ and whose morphisms $\psi : (A, f_i) \to (B, g_i)$ are families of commutative
triangles, thus $\psi : A \to B$ satisfies $f_i = \psi g_i$. It follows that the product, when it
exists, is unique up to isomorphism. We shall denote it by $\prod X_i$; thus we have an
isomorphism, natural in Y:

$$\mathcal{A}\Bigl(Y, \prod X_i\Bigr) \cong \prod \mathcal{A}(Y, X_i). \tag{2.1.2}$$

Here $\prod$ on the left denotes the product just defined, and on the right the usual
Cartesian product of sets.
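The bijection (2.1.1) can be seen concretely in Ens for small finite sets. The sketch below is illustrative only: the sets `X1`, `X2`, `Y` and the helper names are arbitrary choices, and composition is written left to right, as in the text.

```python
from itertools import product


def all_maps(domain, codomain):
    """All mappings domain -> codomain, each represented as a dict."""
    domain = list(domain)
    return [dict(zip(domain, values))
            for values in product(codomain, repeat=len(domain))]


X1, X2, Y = ["a", "b"], [0, 1, 2], ["u", "v"]
P = list(product(X1, X2))          # candidate product: the Cartesian product


def pi1(pair):
    return pair[0]                 # natural projection P -> X1


def pi2(pair):
    return pair[1]                 # natural projection P -> X2


def split(g):
    """Send g : Y -> P to the pair (g pi_1, g pi_2), as in (2.1.1)."""
    return (tuple((y, pi1(g[y])) for y in Y),
            tuple((y, pi2(g[y])) for y in Y))


maps_into_P = all_maps(Y, P)
images = {split(g) for g in maps_into_P}
expected = len(all_maps(Y, X1)) * len(all_maps(Y, X2))
```

Here `split` is injective (distinct maps into P give distinct pairs) and the number of images equals the size of the Cartesian product of the two hom sets, so the mapping is a bijection in this instance.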


Examples 


1. In Ens, $\prod$ reduces to the Cartesian product. Likewise in Ab, the category of
abelian groups, or more generally, in $\mathrm{Mod}_R$, the category of right R-modules,
$\prod$ is the direct product introduced in BA, Section 4.2.

2. In the category of all abelian torsion groups, $\prod X_i$ is the torsion subgroup of the
ordinary direct product of the $X_i$.

3. In the category of all finite abelian groups the product does not exist; this is easily 
seen by taking an infinite family of non-trivial groups. 

4. The product of the empty family (in any category where it exists) is the final 
object in the category. For here we have on the right of (2.1.1) the empty product, 
which by convention is a 1-element set. 


There is a dual construction, the coproduct: given any family $(X_i)$ of $\mathcal{A}$-objects, their
coproduct, also called sum, is an $\mathcal{A}$-object S with maps $\mu_i : X_i \to S$, called the natural
injections, such that $(S, \mu_i)$ is a product of the $X_i$ in the dual category $\mathcal{A}^{\circ}$. For an
explicit definition we need only reverse the arrows in the definition of the product.
Thus $(S, \mu_i)$ is a coproduct if for any $\mathcal{A}$-object Y the natural mapping

$$\mathcal{A}(S, Y) \to \prod \mathcal{A}(X_i, Y)$$

given by $f \mapsto (\mu_i f)$ is a bijection. As before, the coproduct, when it exists, is unique up
to isomorphism; denoting it by $\coprod X_i$, we have the isomorphism

$$\mathcal{A}\Bigl(\coprod X_i, Y\Bigr) \cong \prod \mathcal{A}(X_i, Y).$$

For example, in Ens the coproduct is the disjoint union, while in Ab or $\mathrm{Mod}_R$ it is the
direct sum (see BA, Section 4.2). As usual, a power of M is a product of copies of M; 
similarly a copower is a coproduct (or sum) of copies of M. In particular, for a finite 
family of modules the product and coproduct are the same, except for the associated 
mappings (which go in opposite directions). The connexion between these two 
concepts becomes clearer in additive categories. 


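Definition. A category $\mathcal{A}$ is said to be additive if

Ad.1 each $\mathcal{A}(X, Y)$ is an abelian group for an operation written +,
Ad.2 the composition $(\alpha, \beta) \mapsto \alpha\beta$ is biadditive, i.e.

$$(\alpha + \alpha')\beta = \alpha\beta + \alpha'\beta, \qquad \alpha(\beta + \beta') = \alpha\beta + \alpha\beta', \tag{2.1.3}$$

Ad.3 each finite family has a product and a coproduct.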


By applying Ad.3 to the empty family of objects we see that each additive category 
has an initial and a final object. In Ad.3 it is enough to assume the existence for 
pairs of objects and the empty family; the full strength can then be recovered by 
an easy induction argument. We also observe that axioms Ad.1—Ad.3 are self-dual. 
Further we remark that in Ad.3 it is enough to demand the existence of products 
(or only coproducts); the existence of the other sort results from the remarks follow- 
ing Theorem 2.1.1 below. 

For example, the category $\mathrm{Mod}_R$ of R-modules is additive, since $\mathrm{Hom}_R(M, N)$ has
an abelian group structure for which (2.1.3) holds. On the other hand, Ens, Rg and
Gp are not additive, for there is no way of defining an abelian group structure on the
hom sets to satisfy (2.1.3).

It is clear that a category with a single object is just a monoid; if the category also
satisfies Ad.1 and Ad.2, it is a ring. Similarly in any additive category $\mathcal{A}$, the group
$\mathcal{A}(X, X)$ is a ring for each $X \in \mathrm{Ob}\,\mathcal{A}$.

For finite families of objects in an additive category we can define a further type of
product, which helps to clarify the connexion between products and coproducts. Let
$\mathcal{A}$ be any additive category and $X_1, \ldots, X_n$ any $\mathcal{A}$-objects. The biproduct of the
family $(X_1, \ldots, X_n)$, written $\prod X_i$, is an object B with 2n maps $p_i : B \to X_i$,
$q_i : X_i \to B$, such that

$$q_i p_j = \delta_{ij}, \qquad \sum p_i q_i = 1. \tag{2.1.4}$$


The three kinds of product are related by 


Theorem 2.1.1. In an additive category $\mathcal{A}$ let $(X_i)$ be a finite family of objects. Given
an $\mathcal{A}$-object B and maps $p_i : B \to X_i$, $q_i : X_i \to B$, such that $q_i p_j = \delta_{ij}$, the following
conditions are equivalent:

(a) $(B, p_i)$ is a product of the $X_i$,
(b) $(B, q_i)$ is a coproduct of the $X_i$,
(c) $(B, p_i, q_i)$ is a biproduct of the $X_i$.

Proof. (a) ⇒ (c). Let $(B, p_i)$ be a product and write $\varphi = \sum p_j q_j$; then $\varphi p_i : B \to X_i$
satisfies $\varphi p_i = p_i$, for $\varphi p_i = \sum_j p_j q_j p_i = p_i$. By uniqueness, $\varphi = 1_B$ and so B is a
biproduct.

(c) ⇒ (a). Given $f_i : A \to X_i$, we can define $f : A \to B$ by $f = \sum f_i q_i$. Then
$f p_i = \sum_j f_j q_j p_i = f_i$, and f is the only map with this property, since any such f
satisfies $f = f \sum p_i q_i = \sum f_i q_i$; therefore $(B, p_i)$ is a product. Thus we have shown that
(a) ⇔ (c) and by duality, (b) ⇔ (c). ∎
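As a concrete instance of Theorem 2.1.1, take free abelian groups with maps given by integer matrices acting on row vectors from the right, so that composition in the order of the text is the matrix product in the same order. The sketch below is an illustration only: the choice $X_1 = \mathbf{Z}$, $X_2 = \mathbf{Z}^2$, $B = \mathbf{Z}^3$ and the helper names are assumptions made for the example.

```python
def matmul(a, b):
    """Matrix product; with maps written on the right this is 'a followed by b'."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def matadd(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


def zeros(rows, cols):
    return [[0] * cols for _ in range(rows)]


# Injections q_i : X_i -> B and projections p_i : B -> X_i.
q1, p1 = [[1, 0, 0]], [[1], [0], [0]]
q2, p2 = [[0, 1, 0], [0, 0, 1]], [[0, 0], [1, 0], [0, 1]]

# The biproduct equations (2.1.4).
sum_pq = matadd(matmul(p1, q1), matmul(p2, q2))

# (c) => (a): maps f_i : A -> X_i with A = Z^2 combine to f = f_1 q_1 + f_2 q_2.
f1 = [[2], [5]]
f2 = [[1, 0], [3, 4]]
f = matadd(matmul(f1, q1), matmul(f2, q2))
```

Composing `f` with the projections recovers `f1` and `f2`, which is the product property used in the proof.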


We observe that any finite product (or coproduct) in any category satisfying Ad.1
and Ad.2 can be completed to a biproduct in a unique way. Given a product $(B, p_i)$
of $X_1, \ldots, X_n$, fix j and define $f_i : X_j \to X_i$ by $f_i = \delta_{ji}$. Then there exists $q_j : X_j \to B$
such that $q_j p_i = \delta_{ji}$. This holds for $j = 1, \ldots, n$, and now $(B, p_i, q_j)$ is a biproduct by
Theorem 2.1.1. Thus every finite product can be completed to a biproduct in a
unique way, and by duality the same holds for finite coproducts. So we find

Corollary 2.1.2. Let $\mathcal{A}$ be a category satisfying Ad.1 and Ad.2. Then any finite family
of $\mathcal{A}$-objects has a coproduct if and only if it has a product, and the two are isomorphic.
In particular, the product and coproduct of any finite family are isomorphic in any
additive category. ∎


Taking the empty family, we see that the initial and final object in any additive 
category are isomorphic. An object that is both initial and final is called a zero 
object. Thus every additive category has a zero object. By a zero morphism we under- 
stand a morphism which can be factored via a zero object. With this definition it is 
easily seen that in an additive category the neutral element in each hom group is the 
zero morphism. 

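We also note that on writing $p = (p_1, \ldots, p_n)$, $q = (q_1, \ldots, q_n)^T$, we can express
(2.1.4) in the form

$$qp = I, \qquad pq = 1,$$

where I denotes the $n \times n$ matrix with entries $\delta_{ij}$.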


Our next task is a categorical description of kernels; we begin with monomorphisms
and subobjects. In any category (not necessarily additive) a map $\alpha : X \to Y$ is said to
be monic or a monomorphism if whenever $\lambda\alpha$, $\mu\alpha$ are both defined, then

$$\lambda\alpha = \mu\alpha \quad \text{implies} \quad \lambda = \mu.$$

In an additive category this condition can of course be simplified to

$$\lambda\alpha = 0 \quad \text{implies} \quad \lambda = 0,$$

whenever $\lambda\alpha$ is defined. In Ens the monomorphisms are just the injective mappings.
More generally, in any concrete category injective morphisms are monic; the converse 


holds frequently but not always. By a subobject of an object A we understand a pair
$(X, \alpha)$ such that $\alpha : X \to A$ is monic. Two subobjects $(X, \alpha)$ and $(X', \alpha')$ of a
given object A are said to be equivalent if there is an isomorphism $\lambda : X \to X'$
such that $\alpha = \lambda\alpha'$. It is clear that this leads to the expected notion of subset in the
category Ens, or subgroup in Gp.

Dually a morphism $\alpha : X \to Y$ is said to be epic or an epimorphism if it is left
cancellable:

$$\alpha\lambda = \alpha\mu \quad \text{implies} \quad \lambda = \mu.$$

In an additive category this can again be shortened to $\alpha\lambda = 0 \Rightarrow \lambda = 0$. If $f : A \to X$
is epic, the pair (X, f) is called a quotient object of A, and equivalence is defined as
before. A quotient object in Ens is just a quotient set; in Gp it is a quotient group (by
a normal subgroup).

Let $\mathcal{A}$ be an additive category; given a map $\alpha : X \to Y$, we shall define the kernel
of $\alpha$ as a certain subobject of X. Consider all maps $\lambda : A \to X$ such that $\lambda\alpha = 0$; for
fixed $\alpha$ we obtain a category by taking these maps $\lambda$ as objects and as morphisms
from $\lambda$ to $\lambda'$ maps $\varphi$ from the source of $\lambda$ to that of $\lambda'$ such that $\lambda = \varphi\lambda'$, with
the obvious composition rule (obtained by composing maps in $\mathcal{A}$). A final object
in this category, if one exists, is called a kernel of $\alpha$. Thus a kernel of $\alpha$ is a map
$\lambda : A \to X$ such that $\lambda\alpha = 0$ and any other map $\lambda' : A' \to X$ satisfying $\lambda'\alpha = 0$
can be factored uniquely by $\lambda$, i.e. we have $\lambda' = \varphi\lambda$ for a unique map $\varphi$. This can
be expressed more briefly by saying that the kernel is the largest subobject ‘killed’
(i.e. mapped to 0) by $\alpha$. The kernel need not exist, but if it does, it is unique up
to equivalence, and in fact is a subobject of X. For let $(A, \lambda)$ be the kernel and
assume that $f\lambda = 0$; then by the uniqueness of the factorization, f = 0. The kernel
A of $\alpha$, or also the map $\lambda$ to X, will be denoted by ker $\alpha$.

Dually the cokernel of $\alpha : X \to Y$ is an initial object in the category of all maps
$\lambda : Y \to C$ such that $\alpha\lambda = 0$. This is a quotient object of Y, unique up to
equivalence if it exists; it (or also the map from Y) will be denoted by coker $\alpha$.

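Given any map $\alpha : X \to Y$, assume that ker $\alpha$, coker $\alpha$ exist. Then we can define
two further objects, the image of $\alpha$, a subobject of Y, and the coimage of $\alpha$, a quotient
object of X:

$$\operatorname{im} \alpha = \ker \operatorname{coker} \alpha, \qquad \operatorname{coim} \alpha = \operatorname{coker} \ker \alpha.$$

Again they need not exist, but if they do, they are unique up to equivalence. Further
we have the following diagram:

$$\begin{array}{ccccccc}
\ker \alpha & \longrightarrow & X & \xrightarrow{\ \alpha\ } & Y & \longrightarrow & \operatorname{coker} \alpha \\
 & & \downarrow & & \uparrow & & \\
 & & \operatorname{coim} \alpha & \xrightarrow{\ \alpha'\ } & \operatorname{im} \alpha & &
\end{array} \tag{2.1.5}$$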


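Here coim $\alpha$ is the largest quotient of X killing ker $\alpha$, hence there is a map
$\kappa : \operatorname{coim} \alpha \to Y$ such that $\alpha = (\operatorname{coim} \alpha)\kappa$. It follows that $(\operatorname{coim} \alpha)\kappa(\operatorname{coker} \alpha) = 0$;
but coim $\alpha$ is epic, so $\kappa(\operatorname{coker} \alpha) = 0$, and since im $\alpha$ is the largest subobject of Y
killed by coker $\alpha$, there is a unique map $\alpha' : \operatorname{coim} \alpha \to \operatorname{im} \alpha$ to make the diagram
commute. If we had proceeded in dual fashion, starting from im $\alpha$ and going via a
map $X \to \operatorname{im} \alpha$, we would have obtained another map $\alpha'' : \operatorname{coim} \alpha \to \operatorname{im} \alpha$ to make
the square commute. Now $\alpha = (\operatorname{coim} \alpha)\alpha'(\operatorname{im} \alpha) = (\operatorname{coim} \alpha)\alpha''(\operatorname{im} \alpha)$; since coim $\alpha$
is epic and im $\alpha$ is monic, it follows that $\alpha' = \alpha''$, so the maps coincide and there is
complete symmetry. In important cases $\alpha'$ is an isomorphism and this suggests the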


Definition. An abelian category is an additive category $\mathcal{A}$ such that

Ab.1 every map in $\mathcal{A}$ has a kernel and a cokernel,
Ab.2 the induced map $\operatorname{coim} \alpha \to \operatorname{im} \alpha$ is an isomorphism.


This definition can be applied to any category with a zero object, not necessarily 
additive; such categories are called exact. 

These axioms, like the others, are self-dual. An example of an abelian category is
Ab, the category of abelian groups, or more generally, the category $\mathrm{Mod}_R$ of right
R-modules, for any ring R (BA, Section 4.2). By contrast the category Gp is not
abelian (it is not even additive), and there are additive categories that are not abelian, 
such as the category of topological abelian groups and continuous homomorphisms; 
here Ab.2 need not hold, because there are continuous homomorphisms that are 
bijective but have no continuous inverse. 

In an abelian category monic and epic maps have a simple description: 


Proposition 2.1.3. In any abelian category a map $\alpha$ is monic if and only if ker $\alpha$ = 0
and epic if and only if coker $\alpha$ = 0. If ker $\alpha$ = coker $\alpha$ = 0, then $\alpha$ is an isomorphism.

Proof. By definition of ker $\alpha$ we have $\lambda\alpha = 0$ iff $\lambda = \lambda'(\ker \alpha)$ for some $\lambda'$. Hence
$\lambda = 0$ holds for all such maps iff ker $\alpha$ = 0. This proves the first assertion; the second
follows by duality. If $\alpha : X \to Y$ is such that ker $\alpha$ = coker $\alpha$ = 0, then im $\alpha$ = Y,
coim $\alpha$ = X and $\alpha = \alpha'$ is an isomorphism. ∎


We observe that this result often fails to hold in more general categories, e.g. in Rg
the inclusion map $\mathbf{Z} \to \mathbf{Q}$ is both epic and monic but is clearly not an isomorphism.
A sequence of objects and maps in an abelian category

$$\cdots \longrightarrow A_{n-1} \xrightarrow{\ \alpha_n\ } A_n \xrightarrow{\ \alpha_{n+1}\ } A_{n+1} \longrightarrow \cdots$$

is called a complex if $\alpha_n \alpha_{n+1} = 0$ for all n. This means that im $\alpha_n$ is a subobject of
ker $\alpha_{n+1}$ for all n. If we have equality at $A_n$: im $\alpha_n$ = ker $\alpha_{n+1}$, the sequence is said
to be exact at $A_n$. If the sequence is exact at each object, we speak of an exact sequence.
It is clear that this generalizes the usage introduced for modules in BA, Section 4.2.
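For finite abelian groups the conditions "complex" and "exact" can be checked by direct enumeration. The sketch below is an illustration under stated assumptions: groups are modelled as `range(n)` standing for Z/n, the maps `lam`, `mu`, `times4` are chosen for the example, and `image` and `kernel` are ad hoc helper names.

```python
def image(f, domain):
    return {f(x) for x in domain}


def kernel(f, domain):
    return {x for x in domain if f(x) == 0}


Z2, Z4, Z8 = range(2), range(4), range(8)


def lam(x):
    return (2 * x) % 4      # Z/2 -> Z/4


def mu(x):
    return x % 2            # Z/4 -> Z/2


# 0 -> Z/2 -> Z/4 -> Z/2 -> 0 is a complex and is exact at each term.
is_complex = all(mu(lam(x)) == 0 for x in Z2)
exact_in_middle = image(lam, Z2) == kernel(mu, Z4)
lam_monic = kernel(lam, Z2) == {0}
mu_epic = image(mu, Z4) == set(Z2)


def times4(x):
    return (4 * x) % 8      # Z/8 -> Z/8


# Z/8 -> Z/8 -> Z/8 with both maps x -> 4x is a complex but is not exact.
is_complex_2 = all(times4(times4(x)) == 0 for x in Z8)
exact_2 = image(times4, Z8) == kernel(times4, Z8)
```

In the second sequence the image {0, 4} is a proper subgroup of the kernel {0, 2, 4, 6}, so it is a complex that fails to be exact.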

The simplest cases of exact sequences are $0 \to A \to 0$, which means that A = 0
(A is the zero object) and $0 \to A \to B \to 0$, which means that $A \cong B$, by
Proposition 2.1.3. The first non-trivial case is that of a short exact sequence

$$0 \longrightarrow A' \xrightarrow{\ \lambda\ } A \xrightarrow{\ \mu\ } A'' \longrightarrow 0. \tag{2.1.6}$$

In the case of modules this indicates that $A'$ is isomorphic to a submodule of A, with
quotient isomorphic to $A''$; thus A is an extension of $A'$ by $A''$. In an abelian category
we take (2.1.6) as the definition of an extension; thus we call A an extension of $A'$ by
$A''$ when there is an exact sequence (2.1.6). A monomorphism $\lambda$ clearly satisfies
$\lambda = \ker \operatorname{coker} \lambda$, hence in the short exact sequence (2.1.6), $\lambda = \ker \mu$ and dually,
$\mu = \operatorname{coker} \lambda$. The next case is that of an exact sequence with four non-zero terms;
this arises for example when we analyse a general map $\alpha : X \to Y$ and obtain the
exact sequence in the top line of diagram (2.1.5).

As for modules, the case of split exact sequences is important: 


Proposition 2.1.4. Given maps $\alpha : X \to Y$, $\beta : Y \to X$ in an abelian category such
that $\alpha\beta = 1$, the canonical composite $\ker \beta \to Y \to \operatorname{coker} \alpha$ is an isomorphism.

Proof. Write $\alpha' = \operatorname{coker} \alpha$, $\beta' = \ker \beta$, so that $\alpha\alpha' = 0 = \beta'\beta$; we have to show that
$\beta'\alpha'$ is an isomorphism. Since $\beta$ is epic, $\beta = \operatorname{coker} \beta'$; if $\beta'\alpha' f = 0$ for some f, then
there exists g such that $\alpha' f = \beta g$, by the definition of $\beta'$ as ker $\beta$. Hence
$g = \alpha\beta g = \alpha\alpha' f = 0$, so $\alpha' f = \beta g = 0$, but $\alpha'$ is epic and hence f = 0. This shows
that $\beta'\alpha'$ is epic; by duality it is monic and so is an isomorphism. ∎

When maps $\alpha$, $\beta$ are related as in Proposition 2.1.4, $\alpha$ is called a section and $\beta$ a
retraction.


Corollary 2.1.5. For a short exact sequence (2.1.6) in an abelian category the following
conditions are equivalent:

(a) $\lambda$ is a section,
(b) $\mu$ is a retraction,
(c) $A \cong A' \sqcap A''$ for suitable maps $\alpha : A \to A'$, $\beta : A'' \to A$.

Proof. (a) ⇒ (c). By hypothesis there is a map $\alpha$ such that $\lambda\alpha = 1$. Put $\nu : \ker \alpha \to A$
for the canonical inclusion. Since $\mu = \operatorname{coker} \lambda$, it follows from Proposition 2.1.4 that
$\nu\mu$ is an isomorphism and on writing $\beta = (\nu\mu)^{-1}\nu : A'' \to A$, we find that $\beta\mu = 1$.
We claim that A is a biproduct of $A'$, $A''$ relative to the maps $\lambda$, $\beta$; $\alpha$, $\mu$. Clearly
$\lambda\mu = 0$, $\beta\alpha = 0$, so it only remains to show that $\alpha\lambda + \mu\beta = 1$. Write
$f = \alpha\lambda + \mu\beta - 1$; then $f\mu = \mu\beta\mu - \mu = 0$, hence $f = f'\lambda$ where $f' = f'\lambda\alpha =
f\alpha = \alpha\lambda\alpha + \mu\beta\alpha - \alpha = \alpha - \alpha = 0$; it follows that f = 0 as claimed. Thus
$A \cong A' \sqcap A''$; the converse is clear, hence (a) ⇔ (c), and now (b) ⇔ (c) follows by
duality. ∎


A short exact sequence satisfying the equivalent conditions of this corollary is said 
to be split exact. 

We recall from BA, Section 4.2 that in a category of modules, for any pair of maps
with a common target, $\alpha : A \to C$, $\beta : B \to C$, there is a ‘least common left multiple’
P with maps $\alpha' : P \to B$, $\beta' : P \to A$ such that $\alpha'\beta = \beta'\alpha$ and for any pair $\alpha''$, $\beta''$
such that $\alpha''\beta = \beta''\alpha$ there exists $\gamma$ such that $\alpha'' = \gamma\alpha'$, $\beta'' = \gamma\beta'$. It is called the
pullback of the triple $(\alpha, \beta, C)$. This pullback exists in any abelian category, for,
given $\alpha : A \to C$, $\beta : B \to C$, form the product $A \sqcap B$ with projections p, q on A,
B respectively; now it is easily verified that $\ker(p\alpha - q\beta)$ is a pullback of $\alpha$, $\beta$.
A dual construction can be carried out for the pushout of a triple $(C, \alpha, \beta)$ as
$\operatorname{coker}(\alpha i, \beta j)$, where i, j are the injections of A, B (the targets of $\alpha$, $\beta$) into the
coproduct $A \sqcup B$.




The following property of pullbacks was proved in BA for the module case 
(Proposition 4.2.1); we now see that it holds quite generally: 


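Proposition 2.1.6. Let $\mathcal{A}$ be an additive category. Given a pullback diagram
(A, B, C; P) as shown below, if ker $\alpha'$ exists, then ker $\alpha$ exists and $\ker \alpha \cong \ker \alpha'$.
A dual result holds for pushouts.

$$\begin{array}{ccc}
P & \xrightarrow{\ \beta'\ } & A \\
{\scriptstyle \alpha'}\downarrow & & \downarrow{\scriptstyle \alpha} \\
B & \xrightarrow{\ \beta\ } & C
\end{array}$$

Proof. Write $\ker \alpha' = (K', \nu')$; we shall show that $(K', \nu'\beta')$ is a kernel of $\alpha$. In the
first place $\nu'\beta'\alpha = \nu'\alpha'\beta = 0$; secondly, if $\nu : K \to A$ is such that $\nu\alpha = 0$, then the
triangle ABC can be completed by K to a commutative square, with $0 : K \to B$,
hence there is a unique map $\lambda : K \to P$ such that $\lambda\beta' = \nu$, $\lambda\alpha' = 0$. Since
$\nu' = \ker \alpha'$, there is a unique map $\mu : K \to K'$ such that $\mu\nu' = \lambda$, hence
$\mu\nu'\beta' = \nu$ and this shows that $\nu$ can be factored uniquely by $\nu'\beta'$; therefore
$\ker \alpha = (K', \nu'\beta')$, as claimed. ∎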


In particular we see that in a pullback in an abelian category, $\alpha'$ is monic iff $\alpha$
is monic and dually for pushouts. Consider a commutative square, as in the above
diagram. This corresponds to a complex

$$0 \longrightarrow P \xrightarrow{\ \lambda\ } A \sqcap B \xrightarrow{\ \mu\ } C \longrightarrow 0, \tag{2.1.7}$$

where $\lambda = (\beta' i, \alpha' j)$, $\mu = p\alpha - q\beta$ and i, j, p, q are the natural injections and
projections of the biproduct $A \sqcap B$. The square is a pullback iff $P = \ker(p\alpha - q\beta)$, i.e.
(2.1.7) is exact at P and $A \sqcap B$; it is a pushout iff $C = \operatorname{coker}(\beta' i, \alpha' j)$, i.e. (2.1.7)
is exact at $A \sqcap B$ and C. It follows that a pullback is also a pushout whenever $\mu$
is epic. Suppose now that $\alpha$ is epic and let $\nu$ be such that $\mu\nu = 0$. Then
$\alpha\nu = ip\alpha\nu = i(p\alpha - q\beta)\nu = i\mu\nu = 0$, and hence $\nu = 0$; this means that $\mu$ is epic.
Thus if in a pullback $\alpha$ is epic, then we have a pushout and so by Proposition
2.1.6, $\alpha'$ is also epic. This proves

Corollary 2.1.7. Given a pullback diagram as in Proposition 2.1.6 in an abelian
category, if $\alpha$ is epic, then so is $\alpha'$. Dually, if in a pushout diagram $\alpha$ is monic, then
so is $\alpha'$. ∎
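The pullback construction, Proposition 2.1.6 and Corollary 2.1.7 can be checked on a small example with finite cyclic groups. The sketch below is illustrative only: the groups Z/4, Z/6, Z/2 and the reduction maps are chosen for the example, and the pullback is computed as the set of pairs on which the two composites agree, which is what $\ker(p\alpha - q\beta)$ amounts to for modules.

```python
A, B, C = range(4), range(6), range(2)   # Z/4, Z/6, Z/2


def alpha(x):
    return x % 2        # A -> C, an epimorphism


def beta(y):
    return y % 2        # B -> C


# Pullback P = {(x, y) : x alpha = y beta}, a subgroup of the product of A and B.
P = [(x, y) for x in A for y in B if alpha(x) == beta(y)]


def alpha_prime(pair):
    return pair[1]      # P -> B


def beta_prime(pair):
    return pair[0]      # P -> A


# The square commutes: alpha' followed by beta equals beta' followed by alpha.
commutes = all(beta(alpha_prime(t)) == alpha(beta_prime(t)) for t in P)

# Corollary 2.1.7: alpha epic implies alpha' epic.
alpha_prime_epic = {alpha_prime(t) for t in P} == set(B)

# Proposition 2.1.6: beta' maps ker alpha' bijectively onto ker alpha.
ker_alpha = {x for x in A if alpha(x) == 0}
ker_alpha_prime = [t for t in P if alpha_prime(t) == 0]
induced = [beta_prime(t) for t in ker_alpha_prime]
```

Here P has 12 elements, and the kernel of $\alpha'$ consists of the two pairs (0, 0) and (2, 0), matching ker $\alpha$ = {0, 2}.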


Exercises 


1. Show that Ens has an initial and a final object, but no zero object. 

2. Show that in Rg the inclusion $\mathbf{Z} \to \mathbf{Q}$ is monic and epic but not an
isomorphism. Is the inclusion $\mathbf{Z} \to \mathbf{R}$ an epimorphism?

3. Show that in a concrete category every injective morphism is a monomorphism.




4. Show that any epimorphism of groups is surjective. (Hint. If $\alpha$ with target G is
not surjective, examine the maps from G to the group of all permutations of G.)

5. Show that the pushout of two maps $\alpha : C \to A$, $\beta : C \to B$, one of which is the
zero map, is $A \oplus B$.

6. Show that $A' \xrightarrow{\lambda} A \xrightarrow{\mu} A''$ is exact iff the compositions $\operatorname{im} \lambda \to A \to \operatorname{coim} \mu$
and $\ker \mu \to A \to \operatorname{coker} \lambda$ are both zero.

7. Show that in an abelian category a map $\alpha$ is monic iff $\alpha = \ker \operatorname{coker} \alpha$, and epic
iff $\alpha = \operatorname{coker} \ker \alpha$.

8. Let $\mathcal{A}$ be an abelian category; a subcategory $\mathcal{A}_1$ is said to be abelian if with any
morphism $\alpha$, ker $\alpha$ and coker $\alpha$ (formed in $\mathcal{A}$) lie in $\mathcal{A}_1$. Verify that $\mathcal{A}_1$ is again
an abelian category.

9. (3 x 3 lemma in abelian categories) Given three short exact sequences whose 
second and third terms, topped and tailed by 0’s, form columns of exact 
sequences, so as to form a commutative diagram, show that there is just one 
way to fill in arrows between the first terms so as to make the diagram com- 
mutative, and the first column is then exact. 

10. (Windmill lemma) Given two rows of short exact sequences with a common 
middle term, written as a row and column, say, form the pullback of the 
NW-square, the pushout of the SE-square and factorize the SW- and NE-squares 
through their images. Show that the resulting diagram is commutative, with 
exact rows and columns. Deduce the second isomorphism theorem for abelian
categories: $A_1/(A_1 \cap A_2) \cong (A_1 + A_2)/A_2$.

11. Let R be any ring. Given R-modules and homomorphisms $\alpha : A \to C$,
$\beta : B \to C$, show that the pullback of $\alpha$ and $\beta$ is the submodule of $A \oplus B$
given by $\{(x, y) \mid x\alpha = y\beta\}$. Given $\alpha : C \to A$, $\beta : C \to B$, show that their
pushout is $(A \oplus B)/K$, where $K = \{(z\alpha, z\beta) \mid z \in C\}$.

12. Show that the pushout of two K-algebras A, B relative to the natural maps
$K \to A$, $K \to B$ is their tensor product over K.


2.2 Functors on abelian categories 


Whenever we consider functors between additive categories we shall assume that
they are additive; here $F : \mathcal{A} \to \mathcal{B}$ is called additive if $(\alpha + \beta)^F = \alpha^F + \beta^F$,
whenever $\alpha + \beta$ is defined; thus the mapping $\mathcal{A}(X, Y) \to \mathcal{B}(X^F, Y^F)$ is a group
homomorphism. For example, in any additive category $\mathcal{A}$ the hom functors
$h^A : X \mapsto \mathcal{A}(A, X)$ and $h_A : X \mapsto \mathcal{A}(X, A)$ are additive functors from $\mathcal{A}$ to Ab; on
the other hand, the functor $X \mapsto \mathrm{Hom}(X^*, X)$ between vector spaces, where $X^*$ is
the dual of X, is not additive. To give another example, an additive category with
a single object is just a ring; now an additive functor between one-object categories 
is nothing other than a ring homomorphism, or an antihomomorphism in the case 
of a contravariant functor. Henceforth all functors are assumed to be additive unless 
otherwise stated. 

Let $\mathcal{A}$, $\mathcal{B}$ be any categories. We recall (from BA, Section 3.3) that between two
functors F, G from $\mathcal{A}$ to $\mathcal{B}$ a natural transformation is a family of morphisms
$\varphi_X : X^F \to X^G$ such that for any $\mathcal{A}$-morphism $f : X \to Y$ we have $f^F \varphi_Y = \varphi_X f^G$;
a natural transformation with a natural transformation as inverse is a natural
isomorphism.

If we apply an additive functor to a biproduct $(B, p_i, q_i)$, the defining equations
between the p’s and q’s are preserved, hence the result is again a biproduct. By
Theorem 2.1.1 we obtain

Proposition 2.2.1. Any additive functor acting on an abelian category preserves finite
products, coproducts and biproducts. ∎


Clearly any functor takes zero maps to zero maps and hence transforms a complex
into a complex. We shall be particularly interested in functors that preserve
exactness. A functor T is said to be exact if it transforms each exact sequence

$$A \xrightarrow{\ \lambda\ } B \xrightarrow{\ \mu\ } C \tag{2.2.1}$$

into an exact sequence

$$A^T \xrightarrow{\ \lambda^T\ } B^T \xrightarrow{\ \mu^T\ } C^T. \tag{2.2.2}$$


For example, an equivalence between categories is an exact functor. We recall from
BA, Section 3.3 that two categories $\mathcal{A}$, $\mathcal{B}$ are equivalent if there are two functors
$T : \mathcal{A} \to \mathcal{B}$, $S : \mathcal{B} \to \mathcal{A}$ such that TS is naturally isomorphic to the identity functor
on $\mathcal{A}$, and similarly ST is naturally isomorphic to the identity on $\mathcal{B}$. Any functor
$T : \mathcal{A} \to \mathcal{B}$ defines for each pair X, Y of $\mathcal{A}$-objects a mapping

$$\mathcal{A}(X, Y) \to \mathcal{B}(X^T, Y^T). \tag{2.2.3}$$

The functor T is called faithful if (2.2.3) is injective and full if (2.2.3) is surjective. For
an equivalence functor T, (2.2.3) is a bijection, so in this case T is full and faithful.
Moreover, an equivalence functor T is dense in the sense that every $\mathcal{B}$-object is
isomorphic to one of the form $X^T$, for some $\mathcal{A}$-object X. As we saw in BA, Proposition
3.3.1, a functor T is an equivalence iff it is full, faithful and dense.

All this holds in quite arbitrary categories; when $\mathcal{A}$, $\mathcal{B}$ are additive (and by
assumption T is an additive functor), (2.2.3) is clearly a group homomorphism,
and it follows easily from this that any equivalence functor is again exact.

However, exact functors are rare; most functors only satisfy a weaker condition. 
We define a functor to be left exact if it preserves kernels and right exact if it 
preserves cokernels. First we have a restatement of this condition. 


Proposition 2.2.2. A functor between abelian categories $T : \mathcal{A} \to \mathcal{B}$ is left exact if and
only if the exactness of

$$0 \longrightarrow A \xrightarrow{\ \lambda\ } B \xrightarrow{\ \mu\ } C \tag{2.2.4}$$

implies the exactness of

$$0 \longrightarrow A^T \xrightarrow{\ \lambda^T\ } B^T \xrightarrow{\ \mu^T\ } C^T. \tag{2.2.5}$$

Similarly T is right exact if and only if it preserves exactness when the 0 in (2.2.4) is at
the other end (i.e. when coker $\mu$ = 0).

Proof. The exactness of (2.2.4) is expressed by the equation $\lambda = \ker \mu$. If T preserves
kernels, it follows that $\lambda^T = \ker \mu^T$ and so (2.2.5) is exact. Conversely, if (2.2.5) is
exact, then by applying T to the exact sequence $0 \to 0 \to A \to B$, we find that the
sequence $0 \to A^T \to B^T$ is exact, as well as (2.2.5), so $\lambda^T = \ker \mu^T$, and this
shows T to be left exact. Similarly for right exactness. ∎


Corollary 2.2.3. A functor between abelian categories is exact if and only if it is left and
right exact.

Proof. Clearly an exact functor is left and right exact; conversely, if a functor T is left
and right exact, it preserves kernels and cokernels, hence images and coimages; by
hypothesis $\operatorname{im} \lambda = \ker \mu$ in (2.2.1), hence $\operatorname{im} \lambda^T = (\operatorname{im} \lambda)^T = (\ker \mu)^T = \ker \mu^T$,
so T is indeed exact. ∎


We note that (2.2.1) is exact iff the sequence

$$0 \longrightarrow \operatorname{im} \lambda \longrightarrow B \longrightarrow \operatorname{coim} \mu \longrightarrow 0$$

is exact. Thus if T transforms short exact sequences into short exact sequences, then
it is exact. The converse is clear, so we have

Corollary 2.2.4. A functor between abelian categories is exact if and only if it preserves
the exactness of short exact sequences. ∎


So far all functors were tacitly assumed to be covariant. If $T : \mathcal{A} \to \mathcal{B}$ is a
contravariant functor, we shall call T left exact if the covariant functor $\mathrm{op}.T : \mathcal{A}^{\circ} \to \mathcal{B}$ is
left exact. Right exact contravariant functors are defined correspondingly, by the
right exactness of op.T. The reason for this form of the definition (rather than
using $T.\mathrm{op} : \mathcal{A} \to \mathcal{B}^{\circ}$) is to be found in

Theorem 2.2.5. For any abelian category $\mathcal{A}$, the bifunctor $\mathcal{A}(X, Y)$ is left exact in each
argument, i.e. $h^X$, $h_Y$ are each left exact.

Proof. For any $\mu : Y \to Y''$ in $\mathcal{A}$ the kernel of the induced mapping
$\mathcal{A}(X, \mu) : \mathcal{A}(X, Y) \to \mathcal{A}(X, Y'')$ is the set of morphisms killed by $\mu$, i.e. the
maps that factor uniquely through ker $\mu$. Thus

$$\ker \mathcal{A}(X, \mu) = \mathcal{A}(X, \ker \mu),$$

hence $h^X$ is left exact, as claimed. Similarly, for any $\lambda : X' \to X$ in $\mathcal{A}$,

$$\ker \mathcal{A}(\lambda, Y) = \mathcal{A}(\operatorname{coker} \lambda, Y),$$

therefore $h_Y$ is left exact. ∎
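Theorem 2.2.5 asserts left exactness only, and a small computation shows that right exactness can indeed fail. The sketch below applies $h^X$ with X = Z/2 to the short exact sequence $0 \to \mathbf{Z}/2 \to \mathbf{Z}/4 \to \mathbf{Z}/2 \to 0$; the choice of example and the helper name `homs` are assumptions made for illustration. A homomorphism Z/m → Z/n is recorded by the image a of 1, which must satisfy ma ≡ 0 (mod n).

```python
def homs(m, n):
    """Homomorphisms Z/m -> Z/n, each given by the image of the generator 1."""
    return [a for a in range(n) if (m * a) % n == 0]


def lam(x):
    return (2 * x) % 4      # Z/2 -> Z/4, monic


def mu(x):
    return x % 2            # Z/4 -> Z/2, epic


# Apply Hom(Z/2, -): composing with lam or mu acts on the image of the generator.
hom_first = homs(2, 2)      # Hom(Z/2, Z/2)
hom_middle = homs(2, 4)     # Hom(Z/2, Z/4)
hom_last = homs(2, 2)       # Hom(Z/2, Z/2)

image_of_lam = {lam(a) for a in hom_first}
kernel_of_mu = {a for a in hom_middle if mu(a) == 0}
image_of_mu = {mu(a) for a in hom_middle}

left_exact_here = (len(image_of_lam) == len(hom_first)
                   and image_of_lam == kernel_of_mu)
right_exact_here = image_of_mu == set(hom_last)
```

The induced sequence is exact at the first two terms, but the identity map of Z/2 does not lift to Z/4, so the last induced map is not surjective.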


Let $\mathcal{A}$ be any category. A functor F from $\mathcal{A}$ to Ens is said to be representable if
there is an $\mathcal{A}$-object P such that $X^F = \mathcal{A}(P, X)$; in other words, F is then naturally
isomorphic to $h^P$ and one also says that F is represented by P. When the category $\mathcal{A}$ is
abelian, we can similarly define the representability of a functor from $\mathcal{A}$ to abelian
groups. A contravariant functor G from $\mathcal{A}$ is called representable if there is an
$\mathcal{A}$-object Q such that $Y^G = \mathcal{A}(Y, Q)$, thus G is naturally isomorphic to $h_Q$. For
example, the dual of a vector space is representable, almost by definition:
$V^* = \mathrm{Hom}_k(V, k)$. To give another example, consider U(R), the group of units of
a ring R. It can be shown that this functor is representable by the infinite cyclic
group Z, thus U(R) = Mon(Z, R), where Mon is the category of monoids and R is
considered as multiplicative monoid.

Sometimes we shall need a criterion for a functor to preserve inexact sequences;
a sequence $A \xrightarrow{\lambda} B \xrightarrow{\mu} C$ is called inexact if it is not exact, i.e. if $\operatorname{im} \lambda \neq \ker \mu$.


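Proposition 2.2.6. A functor T between abelian categories preserves inexact sequences if
and only if it is faithful.

Proof. Suppose first T preserves inexact sequences; we must show that T is faithful,
i.e. $\alpha \neq 0$ implies $\alpha^T \neq 0$. Given $\alpha : A \to B$, where $\alpha \neq 0$, the sequence
$A \xrightarrow{1} A \xrightarrow{\alpha} B$ is inexact, hence it remains so on applying T, i.e. $\ker \alpha^T \neq A^T$ and
so $\alpha^T \neq 0$.

Conversely, assume that T is faithful and consider the sequence (2.2.2). If this is
exact, then $(\lambda\mu)^T = \lambda^T\mu^T = 0$, hence $\lambda\mu = 0$. Now let $\ker \mu = (B', i)$ and consider
the composition $B' \xrightarrow{i} B \xrightarrow{\mu} C$; this is zero, hence so is the result of applying T
and it gives rise to a map $(\ker \mu)^T \to \ker \mu^T$. Likewise there is a map
$\operatorname{coker} \lambda^T \to (\operatorname{coker} \lambda)^T$, and the sequence

$$(\ker \mu)^T \to \ker \mu^T \to B^T \to \operatorname{coker} \lambda^T \to (\operatorname{coker} \lambda)^T$$

is exact at $B^T$; hence the composition $(\ker \mu)^T \to B^T \to (\operatorname{coker} \lambda)^T$ is zero. Since T is
faithful, the composition $\ker \mu \to B \to \operatorname{coker} \lambda$ is zero, and it
follows that the sequence (2.2.1) is exact at B. ∎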


There is a useful test for exactness in the case of adjoint functors. Given two
functors $T : \mathcal{A} \to \mathcal{B}$, $S : \mathcal{B} \to \mathcal{A}$, we call S, T an adjoint pair, or more precisely,
S a left adjoint and T a right adjoint if for any $\mathcal{A}$-object X and $\mathcal{B}$-object Y,

$$\mathcal{A}(Y^S, X) \cong \mathcal{B}(Y, X^T), \tag{2.2.6}$$

where in the case of additive categories $\cong$ is an isomorphism of abelian groups which
is natural in X and Y. The notion of an adjoint pair can of course be defined in quite
general categories; then (2.2.6) is merely a bijection of sets (still natural in X and Y).
For example, if U : Gp → Ens is the forgetful functor from groups, associating with
each group its underlying set, then

$$\mathrm{Gp}(F_X, G) \cong \mathrm{Ens}(X, G^U),$$

where $F_X$ is the free group on X. Generally nearly every universal construction arises
as the left adjoint of a forgetful functor. To give another example, if i : Ab → Gp is
the inclusion functor and ab : Gp → Ab is abelianization, i.e. passing from a group
G to $G^{ab} = G/G'$, the universal abelian image (see BA, Section 3.3), then

$$\mathrm{Ab}(G^{ab}, A) \cong \mathrm{Gp}(G, iA).$$


The typical construction described by a right adjoint singles out a subset by some 
closure operation; see for example Exercise 8. 
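The bijection between maps on generators and homomorphisms on the free object can be sketched in code. To keep the example short it uses free monoids rather than the free groups of the text (a deliberate substitution): words are tuples over X, and the names `extend` and `restrict` are illustrative, not standard library functions.

```python
from functools import reduce


def extend(f, op, unit):
    """Monoid homomorphism on words (tuples over X) induced by a map f : X -> M."""
    def hom(word):
        return reduce(op, (f(x) for x in word), unit)
    return hom


def restrict(h):
    """Recover the map on generators from a homomorphism on words."""
    return lambda x: h((x,))


# Target monoid M: integers modulo 5 under multiplication, with unit 1.
def times_mod5(u, v):
    return (u * v) % 5


values = {"a": 2, "b": 3}


def f(x):
    return values[x]


h = extend(f, times_mod5, 1)
u, v = ("a", "b"), ("a",)
```

Here `h` respects concatenation of words and the empty word, and `restrict(h)` agrees with `f` on the generators, which is one direction of the adjunction bijection in this setting.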

Returning to the general case of an adjoint pair (2.2.6), we observe that each of S, T
determines the other up to natural isomorphism, for if we had

$$\mathcal{B}(Y, X^T) \cong \mathcal{B}(Y, X^{T'}), \tag{2.2.7}$$

let us first take $Y = X^T$ and denote by $\alpha : X^T \to X^{T'}$ the map on the right of (2.2.7)
corresponding to the identity map on the left; next take $Y = X^{T'}$ and let
$\beta : X^{T'} \to X^T$ be the map on the left corresponding to the identity map on the
right. Then $\alpha\beta = 1_{X^T}$, $\beta\alpha = 1_{X^{T'}}$, so $\alpha$ is a natural isomorphism.

We also note that the hom functor as a bifunctor is faithful. Taking for example,
$h^A : X \mapsto \mathcal{A}(A, X)$, we have for $\alpha : X \to Y$ the induced map $\mathcal{A}(A, \alpha) : \lambda \mapsto \lambda\alpha$, which is
right multiplication by $\alpha$, and choosing A = X, $\lambda = 1$, we find that $\lambda\alpha = 0$ for all $\lambda$
implies $\alpha = 0$; similarly for $h_A$. With these preparations we have


Theorem 2.2.7. Let S and T be a pair of adjoint functors between abelian categories $\mathcal{A}$
and $\mathcal{B}$. Then the left adjoint S is right exact and the right adjoint T is left exact.
Moreover, if $\mathcal{A}$, $\mathcal{B}$ have arbitrary products and coproducts, then T preserves products and S
preserves coproducts.

Proof. Let us apply (2.2.6) to a short exact sequence

$$0 \to X' \to X \to X'' \to 0.$$

We obtain a commutative diagram of complexes

$$\begin{array}{ccccccc}
0 & \to & \mathcal{A}(Y^S, X') & \to & \mathcal{A}(Y^S, X) & \to & \mathcal{A}(Y^S, X'') \\
 & & \cong\downarrow & & \cong\downarrow & & \cong\downarrow \\
0 & \to & \mathcal{B}(Y, X'^T) & \to & \mathcal{B}(Y, X^T) & \to & \mathcal{B}(Y, X''^T)
\end{array} \tag{2.2.8}$$

By Theorem 2.2.5 the top row is exact, hence so is the bottom row, and this arises by
applying the functor $h^Y$ to the sequence

$$0 \to X'^T \to X^T \to X''^T. \tag{2.2.9}$$

But $h^Y$, when Y is allowed to vary, is faithful and so preserves inexact sequences; since
the bottom row in (2.2.8) is exact, so is (2.2.9). This proves T to be left exact, and
it preserves products, by (2.2.6). A dual argument shows S to be right exact and
to preserve coproducts. ∎

Let $\mathcal{C}$ be an abelian category and $I$ a partially ordered set, regarded as a small category.
We denote by $\mathcal{C}^I$ the functor category whose objects are functors from $I$ to $\mathcal{C}$
with natural transformations as morphisms; thus the objects are families of $\mathcal{C}$-objects
indexed by $I$, with families of $\mathcal{C}$-maps as morphisms. Explicitly a $\mathcal{C}^I$-object $(A_i, \alpha_{ij})$
consists of $\mathcal{C}$-objects $A_i$ with $\mathcal{C}$-maps $\alpha_{ij} : A_i \to A_j$ for $i \le j$, such that


$\alpha_{ii} = 1$, $\alpha_{ij}\alpha_{jk} = \alpha_{ik}$ $(i \le j \le k)$.   (2.2.10)

Conditions (2.2.10) are called the coherence conditions and a family satisfying them is
said to be coherent. A morphism $f : (A_i, \alpha_{ij}) \to (B_i, \beta_{ij})$ is a family of maps
$f_i : A_i \to B_i$ such that $\alpha_{ij}f_j = f_i\beta_{ij}$ for $i \le j$.

We have the diagonal functor


$\Delta : \mathcal{C} \to \mathcal{C}^I$,   (2.2.11)


which with each $\mathcal{C}$-object $A$ associates the constant family $(A_i, \alpha_{ij})$ with $A_i = A$,
$\alpha_{ij} = 1$. The adjoint functors of $\Delta$ play an important role. The direct limit (also
called the inductive limit or colimit) $\varinjlim$ is defined as the left adjoint of $\Delta$:


$\mathcal{C}(\varinjlim(A_i, \alpha_{ij}), B) \cong \mathcal{C}^I((A_i, \alpha_{ij}), \Delta(B))$.   (2.2.12)


If $L = \varinjlim(A_i)$ exists, there are maps $u_i : A_i \to L$ satisfying $u_i = \alpha_{ij}u_j$ for $i \le j$ such
that any map $(f_i)$ from $(A_i, \alpha_{ij})$ to $\Delta(B)$ can be factored uniquely by $(u_i)$, thus
$f_i = u_i\lambda$ for all $i \in I$, for a unique $\lambda : L \to B$.


For example, when $I$ is totally unordered, the direct limit reduces to the coproduct.
If $I$ consists of three points $i, j, k$ with $i < j$, $i < k$, we obtain the pushout.
Let us describe direct limits for modules. The construction is simplified if we assume
$I$ to be a directed partially ordered set (i.e. given $i, j \in I$, there exists $k \ge i, j$); in that
case the family $(A_i, \alpha_{ij})$ is also called a direct family. The direct limit $L$ is the direct
sum of the $A_i$ modulo the submodule generated by the elements $x - x\alpha_{ij}$ $(x \in A_i)$ for
$i \le j$. For an example from field theory, let $F$ be a field and $(E_i)$ the family of all
extensions of $F$ of finite degree, with inclusions as mappings; then it is clear that
we have a direct family. Its direct limit is a field $\Omega$ containing $F$, which is algebraic
over $F$ and algebraically closed, and hence is the algebraic closure of $F$ (see BA,
Section 7.3 and Section 11.8).
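
A small worked case may help fix the module construction; it is an added illustration, not taken from the text. Take the direct family of Z-modules $A_n = \mathbf{Z}$ ($n \ge 0$) in which each map $A_n \to A_{n+1}$ is multiplication by a fixed prime $p$. Assuming the description of the direct limit of a direct family given above, the sketch below identifies the limit with $\mathbf{Z}[1/p]$; the maps $u_n$ and $\lambda$ are named as in the universal property.

```latex
% Direct family: A_n = Z for n >= 0, with \alpha_{nm} = multiplication by p^{m-n} (n <= m).
% Maps are written on the right, as in the text.
\[
  u_n : A_n \to \mathbf{Z}[1/p], \qquad x u_n = \frac{x}{p^n}.
\]
% Compatibility u_n = \alpha_{nm} u_m:
\[
  x\,\alpha_{nm} u_m = \frac{p^{m-n}x}{p^m} = \frac{x}{p^n} = x u_n \qquad (n \le m).
\]
% Hence the u_n induce a map \lambda : \varinjlim A_n \to \mathbf{Z}[1/p].
% Surjective: every element of \mathbf{Z}[1/p] has the form x/p^n = x u_n.
% Injective: the family is directed, so every element of the limit is the image of
% some x in a single A_n; each u_n is injective, so x u_n = 0 forces x = 0.
\[
  \varinjlim \bigl(\mathbf{Z} \xrightarrow{\,p\,} \mathbf{Z} \xrightarrow{\,p\,} \mathbf{Z} \to \cdots\bigr)
  \;\cong\; \mathbf{Z}[1/p].
\]
```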

Similarly the inverse limit (also the projective limit or simply limit) $\varprojlim$ is defined
as the right adjoint of $\Delta$:


$\mathcal{C}(B, \varprojlim(A_i, \alpha_{ij})) \cong \mathcal{C}^I(\Delta(B), (A_i, \alpha_{ij}))$.


[Diagram: a map $f_i : B \to A_i$ factored as $\mu : B \to C$ followed by $v_i : C \to A_i$.]

The inverse limit $C$ has maps $v_i : C \to A_i$ such that $v_i\alpha_{ij} = v_j$ and any map $(f_i)$ from
$\Delta(B)$ to $(A_i, \alpha_{ij})$ can be factored uniquely by $(v_i)$, thus $f_i = \mu v_i$ for all $i \in I$, for a
unique $\mu : B \to C$.

When $I$ is totally unordered, this reduces to the product of the $A_i$. For a triple $i, j, k$
with $i < k$, $j < k$ it becomes the pullback. To describe the construction for modules
we take $I$ inversely directed and refer to $(A_i, \alpha_{ij})$ as an inverse family. The inverse
limit of such a family of modules is obtained by forming the product $\prod A_i$ and
taking the submodule of all elements $(x_i)$ such that $x_i\alpha_{ij} = x_j$.

To illustrate the notion of inverse limit consider a free group $F$ of rank > 1. Let us
write $(N_i)$ for the family of all normal subgroups of finite index in $F$. Then $G_i = F/N_i$
is a finite group and for $N_i \subseteq N_j$ we have a natural homomorphism $\gamma_{ij} : G_i \to G_j$,
and these homomorphisms are coherent. Since the intersection of any two of the
$N_i$ is again of finite index, we have an inverse family and we can form the inverse
limit $G = \varprojlim(G_i)$. This group $G$ is called a profinite group (as projective limit
of finite groups). Since the natural homomorphisms $F \to G_i$ are compatible with
the $\gamma_{ij}$, we have a canonical homomorphism $\gamma : F \to G$. As we shall see in
Section 3.4, $\bigcap N_i = 1$, and it follows easily from this fact that $\gamma$ is injective. However,
$\gamma$ is not surjective, for $G$, as inverse limit, is uncountable, whereas $F$ is countable
whenever its rank is at most countable. A similar construction is possible for abelian
groups, thus for example Z can be embedded in a profinite group, or even in a
pro-p-group (see Exercise 10).
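
To make the abelian case concrete, here is a hedged sketch (our own illustration, not from the text) of elements of $\varprojlim \mathbf{Z}/p^n$ written as coherent sequences, with the reduction maps $\mathbf{Z}/p^{n+1} \to \mathbf{Z}/p^n$ as the family maps. For an odd prime $p$ it exhibits a coherent sequence that is not the image of any integer, so the embedding of Z is again not surjective.

```latex
% An element of \varprojlim \mathbf{Z}/p^n is a sequence (x_n), x_n \in \mathbf{Z}/p^n,
% with x_{n+1} \equiv x_n \pmod{p^n}.
% The image of an integer m is the sequence (m \bmod p^n).
\[
  x_n = 1 + p + p^2 + \cdots + p^{n-1} \pmod{p^n}, \qquad
  x_{n+1} - x_n = p^n \equiv 0 \pmod{p^n}.
\]
% So (x_n) is coherent. Moreover
\[
  (1 - p)\,x_n = 1 - p^n \equiv 1 \pmod{p^n} \quad\text{for all } n .
\]
% If (x_n) came from an integer m, then (1-p)m \equiv 1 \pmod{p^n} for all n,
% hence (1-p)m = 1 in \mathbf{Z}, forcing 1 - p = \pm 1. For p \ge 3 this is
% impossible, so (x_n) is not in the image of \mathbf{Z}.
```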

The last example illustrates another important point, namely the lack of duality in 
general module categories. As we have seen, the notion of an abelian category can be 
developed in an entirely self-dual manner. However, the category of all modules over 
a given ring is not self-dual, except for very special rings, so in order to describe 
module categories axiomatically one will need axioms whose duals may not hold. 
We shall not carry out the full axiomatization (which can be found in most books 
on category theory) but merely list one axiom holding in all module categories 
but not always for their duals. This is Grothendieck’s 
AB5 Axiom. Given a chain of subobjects $(A_i)$ and any subobject $B$ of an object, we have


$(\bigcup A_i) \cap B = \bigcup (A_i \cap B)$.   (2.2.13)


The following equivalent form will be more convenient for us: 


I, The functor $\varinjlim$ is exact.


Since $\varinjlim$ is in any case right exact, as left adjoint, this requires that for any families
of exact sequences $0 \to A_i \to B_i$ the sequence


$0 \to \varinjlim(A_i) \to \varinjlim(B_i)$


should again be exact. Like (2.2.13) this condition is easily verified in any category of
modules. The dual states


[°. The functor $\varprojlim$ is exact.


By duality we need only verify that the exactness of $A_i \to B_i \to 0$ entails that of
$\varprojlim(A_i) \to \varprojlim(B_i) \to 0$. In most module categories this does not hold. For
example, taking $A_i = \mathbf{Z}$ and $B_i$ the family of finite images, we have $\varprojlim(A_i) = \mathbf{Z}$,
but the mapping $\mathbf{Z} \to \hat{\mathbf{Z}} = \varprojlim(B_i)$ is far from surjective, since the limit $\hat{\mathbf{Z}}$ is
uncountable.

In BA, Section 4.7 we have already met projective and injective modules. Their 
counterpart in abelian categories is of importance, because it can be used to 
remedy the lack of exactness of the hom functor (Theorem 2.2.5). 


Definition. Let $\mathcal{A}$ be an abelian category. An $\mathcal{A}$-object P is called projective if the
covariant hom functor $h^P = \mathcal{A}(P, -)$ is exact; an $\mathcal{A}$-object I is called injective if
the contravariant hom functor $h_I = \mathcal{A}(-, I)$ is exact.

An alternative description of projective objects is given in 


Theorem 2.2.8. Let P be an object in an abelian category $\mathcal{A}$. Then the following
conditions are equivalent:


(a) P is projective,
(b) every short exact sequence


$0 \to A \to B \to P \to 0$   (2.2.14)


with P in third place splits,
(c) given a diagram with exact row as shown, there exists a map $P \to B$ to make the
triangle commutative.


$$\begin{array}{ccccc}
 & & P & & \\
 & & \downarrow & & \\
B & \to & B'' & \to & 0
\end{array}$$


Condition (c) may be expressed by saying: every map from P to a quotient of B may 
be lifted to B. We note that the statement of this theorem is quite similar to that of 
Theorem 4.7.4 of BA for modules, but we shall not be able to use the proof given 
there, which depended on the existence of free modules. On the other hand, the 
proof given below provides another proof of Theorem 4.7.4 of BA. 


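Proof. (a) ⇒ (b). Given a short exact sequence (2.2.14), we find by (a) that the
sequence of abelian groups


$0 \to \mathcal{A}(P, A) \to \mathcal{A}(P, B) \to \mathcal{A}(P, P) \to 0$


is exact. Now $1_P \in \mathcal{A}(P, P)$ and by exactness there exists $\beta \in \mathcal{A}(P, B)$ such that
$\beta\mu = 1_P$, where $\mu : B \to P$ is the map in (2.2.14); hence (2.2.14) splits.
(b) ⇒ (c). By forming the pullback of the given diagram we obtain


$$\begin{array}{ccccccc}
\ker\alpha & \to & C & \xrightarrow{\alpha} & P & & \\
 & & \downarrow & & \downarrow & & \\
 & & B & \to & B'' & \to & 0
\end{array}$$

By Corollary 2.1.7, $\alpha$ is epic, hence by (b) the top row splits, so there is a map
$f : P \to C$ with $f\alpha = 1_P$, and $f$ followed by the pullback map $C \to B$ is the required map $P \to B$.
(c) ⇒ (a). Given a short exact sequence $0 \to B' \to B \to B'' \to 0$, we apply $\mathcal{A}(P, -)$ and obtain


$0 \to \mathcal{A}(P, B') \to \mathcal{A}(P, B) \to \mathcal{A}(P, B'') \to 0$.   (2.2.15)


By the left exactness of hom this can fail to be exact only at $\mathcal{A}(P, B'')$. But by (c)
every map $P \to B''$ lifts to a map $P \to B$, and this means that (2.2.15) is also
exact at $\mathcal{A}(P, B'')$. ∎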
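
As an illustration of condition (b) (an added example, assuming only the theorem just proved), the sketch below checks that in the category of abelian groups $\mathbf{Z}/n$ is not projective for $n \ge 2$, because one particular short exact sequence ending in it fails to split. The names $\mu$ and $\beta$ are chosen here for the example.

```latex
% Short exact sequence of abelian groups with \mathbf{Z}/n in third place (n \ge 2):
\[
  0 \to \mathbf{Z} \xrightarrow{\;n\;} \mathbf{Z} \xrightarrow{\;\mu\;} \mathbf{Z}/n \to 0 .
\]
% A splitting would be a homomorphism \beta : \mathbf{Z}/n \to \mathbf{Z} with \beta\mu = 1
% (maps written on the right). But for any homomorphism \beta and any x \in \mathbf{Z}/n,
\[
  n\,(x\beta) = (nx)\beta = 0\beta = 0 \quad\Longrightarrow\quad x\beta = 0 ,
\]
% since \mathbf{Z} has no non-zero element of finite order. Thus \beta = 0 and
% \beta\mu = 0 \ne 1, so the sequence does not split and \mathbf{Z}/n fails (b).
```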


Of course there is a dual characterization of injectives: 


Theorem 2.2.9. Let I be an object in an abelian category $\mathcal{A}$. Then the following
conditions are equivalent:


(a) I is injective,
(b) every short exact sequence


$0 \to I \to B \to C \to 0$


with I in first place splits,
(c) given a diagram with exact row as shown, there is a map $A \to I$ such that the
resulting triangle is commutative.


$$\begin{array}{ccccc}
0 & \to & A' & \to & A \\
 & & \downarrow & & \\
 & & I & &
\end{array}$$

Here (c) may be expressed by saying: every map from a subobject of A to I can be
extended to A.
The proof is dual to that of Theorem 2.2.8 and so may be left to the reader. ∎


Although the notions of projective and injective module are dual, they can have 
very different appearance in actual categories and we shall return to this question 
for module categories in Section 2.3 and Section 4.6. 


Exercises 


1. Show that a functor between additive categories is additive iff it preserves finite
products.

2. Use Exercise 1 to show that a functor between additive categories forming part
of an adjoint pair is necessarily additive.

3. Show that a subcategory of an abelian category is abelian iff the inclusion functor
is exact.

4. Show that for a faithful functor $T$ in an abelian category, $C \ne 0$ implies $C^T \ne 0$.
Show that for an exact functor this condition is sufficient as well as necessary.

5. Let $T : \mathcal{A} \to \mathcal{B}$, $S : \mathcal{B} \to \mathcal{A}$ be a pair of functors giving an equivalence of
categories. Show that S, T is an adjoint pair as well as T, S.

6. Show that for any abelian category the following are equivalent: (a) every object
is projective, (b) every object is injective, (c) every short exact sequence splits.

7. Let S, T be a pair of adjoint functors between abelian categories. Show that if S
is left exact, then T preserves injectives; if T is right exact, then S preserves
projectives.

8. For any group G denote by ZG the group ring of G over Z. Show that the
correspondence $G \mapsto \mathbf{Z}G$ is a functor from Gp to Rg whose right adjoint is
the functor $R \mapsto U(R)$, where U(R) is the group of units of R.

9. Show that a functor $\mathcal{A} \to \mathcal{B}$ is full and faithful iff $\mathcal{A}$ is equivalent to a full
subcategory of $\mathcal{B}$.

10. Let p be a prime number. Verify that $\bigcap p^n\mathbf{Z} = 0$ and deduce that there is a
natural injection $\mathbf{Z} \to \varprojlim(\mathbf{Z}/p^n)$.


2.3 The category $\mathrm{Mod}_R$


We have already seen that the category $\mathrm{Mod}_R$ of all right R-modules, for any ring R,
is abelian. Frequently R will be a K-algebra (associative, with 1), where K is some
commutative ring; in that case the hom sets $\mathrm{Hom}_R(M, N)$ are K-modules and not
merely abelian groups. We shall say that we have a K-linear category in that case.
A functor F between K-linear categories is required to be not merely additive but
also K-linear:


$(\alpha + \beta)^F = \alpha^F + \beta^F$, $(\lambda\alpha)^F = \lambda\alpha^F$ $(\lambda \in K)$.


As a rule K will be an arbitrary commutative ring, fixed in any given context, and all 
rings will be K-algebras. The case of abstract rings is included by taking K = Z. 

We recall that a right R-module structure on M can be described by saying that we
have a homomorphism


$f : R \to \mathrm{End}_K(M)$.   (2.3.1)


Similarly, a left R-module structure on M corresponds to an antihomomorphism
(2.3.1), i.e. a homomorphism $R^{\circ} \to \mathrm{End}_K(M)$ from the opposite ring; in detail
this is a K-linear mapping $f$ such that $(xy)f = yf \cdot xf$. This remark is often used to
avoid having to pass to the opposite ring. Thus if we have a homomorphism
$R^{\circ} \to \mathrm{End}_K(M)$, we shall regard M as a left R-module rather than a right $R^{\circ}$-module.

Let R, T be any rings and ${}_T\mathrm{Mod}_R$ the category of (T, R)-bimodules; clearly this is a
subcategory of $\mathrm{Mod}_R$. The following lemma on the transport of ring action is often
useful.


Lemma 2.3.1. Let R, S, T be any rings (or K-algebras) and $F : \mathrm{Mod}_R \to \mathrm{Mod}_S$ a
covariant functor. Then F induces a functor $F' : {}_T\mathrm{Mod}_R \to {}_T\mathrm{Mod}_S$. Similarly a contravariant
functor $G : \mathrm{Mod}_R \to {}_S\mathrm{Mod}$ induces a functor $G' : {}_T\mathrm{Mod}_R \to {}_S\mathrm{Mod}_T$.


Proof. Given a (T, R)-bimodule M, we know that $M^F$ is an S-module; further, for
any $t \in T$ we can define the action of $t$ on $M^F$ as $t^F$. Since $t$ defines an endomorphism
of $M_R$, $t^F$ defines an endomorphism of $(M^F)_S$, i.e. an element of $\mathrm{End}_S(M^F)$,
and so


$(xa)t^F = (xt^F)a$ for any $x \in M^F$, $a \in S$.


We claim that the rule $t \mapsto t^F$ defines an antihomomorphism of T into $\mathrm{End}_S(M^F)$.
For if $t, t' \in T$, then $(tt')^F = t'^F \cdot t^F$, because M is a left T-module; hence $M^F$ is
indeed a (T, S)-bimodule. Moreover, a homomorphism $\alpha$ between (T, R)-bimodules
may be characterized as an R-homomorphism centralizing T, hence $\alpha^F$ is an
S-homomorphism centralizing T, i.e. a homomorphism between (T, S)-bimodules.
The second part, referring to G, is proved similarly; since G is contravariant,
it defines a homomorphism of T this time, which means that $M^G$ is an (S, T)-bimodule. ∎


As an example, important for what follows, consider the hom functor. Let M be an
(S, R)-bimodule and N a (T, R)-bimodule; we shall express this briefly by saying that
we are in the situation $({}_SM_R, {}_TN_R)$. Consider $H = \mathrm{Hom}_R(M, N)$; when we regard M,
N as right R-modules, H is just an abelian group (or a K-module). But the left
T-module structure on N induces a left T-module structure on H, while the left
S-module structure on M induces a right S-module structure on H; here the side is
reversed because Hom(M, N) is contravariant in M. Thus we see that H is a left
T-, right S-module; in fact it is a (T, S)-bimodule. To show this let us write $(f, x)$ for
the effect of $f \in H$ on $x \in M$. Then by definition we have for any $r \in R$, $s \in S$, $t \in T$,
$(f, xr) = (f, x)r$, $(fs, x) = (f, sx)$, $(tf, x) = t(f, x)$. Hence
$((tf)s, x) = (tf, sx) = t(f, sx) = t(fs, x) = (t(fs), x)$, i.e. $(tf)s = t(fs)$, as claimed.

A second functor of great importance is the tensor product. We recall from BA,
Section 4.8 that for a K-algebra R and modules $(U_R, {}_RV)$ there is a K-module
$U \otimes_R V$ with a mapping


$\lambda : U \times V \to U \otimes_R V$,


which is universal for K-bilinear mappings $f$ from $U \times V$ to K-modules that are
R-balanced, i.e. such that


$(xr, y)f = (x, ry)f$ for all $x \in U$, $y \in V$, $r \in R$.


We remark that (assuming the tensor product over the commutative ring K as
known), $U \otimes_R V$ may also be obtained as the homomorphic image of $U \otimes_K V$ by
adding the relations $xr \otimes y = x \otimes ry$ $(r \in R)$. Further we recall the equations of
adjoint associativity which follow from the definition of the tensor product. For
the situation $({}_QU_R, {}_RV_S, {}_TW_S)$ we have the natural isomorphism of (T, Q)-bimodules
(adjoint associativity)


$\mathrm{Hom}_S(U \otimes_R V, W) \cong \mathrm{Hom}_R(U, \mathrm{Hom}_S(V, W))$.   (2.3.2)


This may be expressed by saying that $- \otimes_R V$ is the left adjoint of the functor
$h^V = \mathrm{Hom}_S(V, -)$. By symmetry the same holds for $U \otimes_R -$ in the situation
$({}_SU_R, {}_RV, {}_SW_Q)$, using the isomorphism


$\mathrm{Hom}_S(U \otimes_R V, W) \cong \mathrm{Hom}_R(V, \mathrm{Hom}_S(U, W))$.   (2.3.3)


By Theorem 2.2.7 we conclude


Proposition 2.3.2. For a left R-module V over any ring R, the tensor product functor
$- \otimes_R V$ is right exact and preserves direct sums; similarly for right R-modules. ∎
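
Right exactness cannot in general be improved to exactness. The following sketch (an added example, using only Proposition 2.3.2 and the identity $\mathbf{Z} \otimes_{\mathbf{Z}} V \cong V$) tensors an injective map of Z-modules with $V = \mathbf{Z}/2$ and obtains the zero map.

```latex
% Exact sequence of \mathbf{Z}-modules:
\[
  0 \to \mathbf{Z} \xrightarrow{\;2\;} \mathbf{Z} \to \mathbf{Z}/2 \to 0 .
\]
% Apply - \otimes_{\mathbf{Z}} \mathbf{Z}/2 and identify \mathbf{Z} \otimes_{\mathbf{Z}} \mathbf{Z}/2 with \mathbf{Z}/2:
\[
  \mathbf{Z}/2 \xrightarrow{\;2 \,=\, 0\;} \mathbf{Z}/2 \to \mathbf{Z}/2 \otimes_{\mathbf{Z}} \mathbf{Z}/2 \to 0 .
\]
% The sequence is still exact on the right, so \mathbf{Z}/2 \otimes_{\mathbf{Z}} \mathbf{Z}/2 \cong \mathbf{Z}/2,
% but the first map x \otimes y \mapsto 2x \otimes y = x \otimes 2y = 0 is no longer injective.
```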


Further we recall the associative law for tensor products; in the situation
$(U_R, {}_RV_S, {}_SW)$ we have


$U \otimes_R (V \otimes_S W) \cong (U \otimes_R V) \otimes_S W$.   (2.3.4)

We also recall the identity

$U \otimes_R R \cong U$ for $U_R$;   (2.3.5)

it corresponds to the well-known identity for the hom functor

$\mathrm{Hom}_R(R, U) \cong U$.   (2.3.6)


We have already met projective and injective objects in the category of modules in
BA, Section 4.7. In particular, we see from the characterization given there that
the projective R-modules are precisely the direct summands of free R-modules.
There is no such explicit description of injective modules (but see Section 4.6
below); for the moment we note that by Theorem 2.2.9 a module M is injective iff
every short exact sequence with M as first term splits, i.e. iff M is a direct summand
in every module containing it as a submodule. This leads to the following criterion.
An extension of modules $M \subseteq N$ is called essential and M is said to be a large
submodule of N if M has a non-zero intersection with every non-zero submodule
of N.


Proposition 2.3.3. An R-module is injective if and only if it has no proper essential 
extension. 


Proof. Suppose that M is injective. If M is contained as a submodule in N, then it is a
direct summand, so if $N \ne M$, the extension is not essential; thus M has no proper
essential extension. Conversely, assume that M has no proper essential extension,
and let L be any module containing M as a submodule. The family of all submodules
of L meeting M in 0 is clearly inductive and so, by Zorn’s lemma, has a maximal
member, $L_0$ say. Consider $\bar{L} = L/L_0$; since $M \cap L_0 = 0$, M maps isomorphically to
a submodule $\bar{M}$ of $\bar{L}$, and by the maximality of $L_0$, $\bar{L}$ is an essential extension of
$\bar{M} \cong M$. So it cannot be proper, hence $\bar{L} = \bar{M}$, i.e. $M + L_0 = L$ and $M \cap L_0 = 0$,
so M is a direct summand in L. It follows that M is injective, as claimed. ∎


We have already met Reinhold Baer’s injectivity criterion in BA, Theorem 4.7.7; 
here is another very short proof, due to Peter Freyd. 


Theorem 2.3.4 (Baer’s criterion). For any ring R, a left R-module M is injective if and 
only if every homomorphism from a left ideal of R into M can be extended to a homo- 
morphism from R to M. 


Proof. The necessity is clear; to prove the sufficiency of the condition we show that
when it holds, M has no proper essential extension. Let $M \subset L$ be a proper extension,
fix $u \in L \setminus M$ and consider the pullback diagram shown, where $R \to L$ is the map
$r \mapsto ru$.


$$\begin{array}{ccc}
P & \to & R \\
\downarrow & & \downarrow \\
M & \to & L
\end{array}$$


By Proposition 2.1.6, the map $P \to R$ is monic, so P is isomorphic to a left ideal of R
and by hypothesis the map $P \to M$ extends to a homomorphism $R \to M$. If $1 \mapsto v$
in this homomorphism, we have $x \mapsto xv = xu$ for all $x \in P$, and $x \in P$ if $xu \in M$,
hence $R(u - v)$ is a submodule of L such that $R(u - v) \cap M = 0$, but
$R(u - v) \ne 0$, because $v \in M$, $u \notin M$, so $u \ne v$. Thus L is not an essential extension
of M and since L was arbitrary, it follows by Proposition 2.3.3 that M is injective. ∎
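
For $R = \mathbf{Z}$ the criterion becomes very explicit. The sketch below is an added illustration: it checks Baer's condition for a divisible abelian group $M$, recovering the fact from BA that divisible abelian groups are injective. Maps are written on the left here, purely for readability; $g$, $h$, $m$ and $y$ are names chosen for the example.

```latex
% Every ideal of \mathbf{Z} is n\mathbf{Z} for some n \ge 0; the case n = 0 is trivial.
% Let g : n\mathbf{Z} \to M be a homomorphism (n > 0) and put m = g(n).
% M divisible: there is y \in M with ny = m. Define
\[
  h : \mathbf{Z} \to M, \qquad h(k) = ky .
\]
% Then h extends g:
\[
  h(kn) = kny = km = k\,g(n) = g(kn) \qquad (k \in \mathbf{Z}).
\]
% By Baer's criterion M is injective. In particular \mathbf{Q} and \mathbf{Q}/\mathbf{Z}
% are injective \mathbf{Z}-modules.
```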


With every homomorphism of rings there are several transfer functors associated,
which are often useful. Given any rings R, S and a homomorphism $f : R \to S$, any
right S-module U may be defined as an R-module by putting


$x.a = x(af)$ for $x \in U$, $a \in R$.


This R-action on U is said to be defined by pullback along f (not to be confused with
the pullback diagram in Section 2.1), and the resulting R-module is written ${}^fU$. The
correspondence $U \mapsto {}^fU$ is a functor from $\mathrm{Mod}_S$ to $\mathrm{Mod}_R$ rather like the forgetful
functor. We shall want to go in the opposite direction and construct an adjoint;
thus we are given an R-module A and we ask for an associated S-module. There
are two constructions, arising as the left adjoint and the right adjoint of the functor
${}^fU$; they are known as the change-of-rings constructions.


Proposition 2.3.5. Let R, S be rings and $f : R \to S$ a homomorphism. Given a right
R-module A, there is a right S-module $A_f = A \otimes_R S$ left adjoint to ${}^fU$:


$\mathrm{Hom}_S(A_f, U) \cong \mathrm{Hom}_R(A, {}^fU)$,   (2.3.7)


and a right S-module $A^f = \mathrm{Hom}_R(S, A)$ right adjoint to ${}^fU$:

$\mathrm{Hom}_S(U, A^f) \cong \mathrm{Hom}_R({}^fU, A)$.   (2.3.8)


Moreover, there is a map $\alpha : A \to A_f$ which induces a homomorphism of R-modules
(from A to ${}^f(A_f)$) and a map $\beta : A^f \to A$ inducing a homomorphism of R-modules
(from ${}^f(A^f)$ to A). If the R-module structure on A arose by pullback along f from an
S-module V, then V is a direct summand of $A_f$ and of $A^f$, as R-modules.


Proof. The proof is a simple verification, using (2.3.2), (2.3.5) and (2.3.6):


(i) $\mathrm{Hom}_S(A \otimes_R S, U) \cong \mathrm{Hom}_R(A, \mathrm{Hom}_S(S, U)) \cong \mathrm{Hom}_R(A, {}^fU)$,
(ii) $\mathrm{Hom}_S(U, \mathrm{Hom}_R(S, A)) \cong \mathrm{Hom}_R(U \otimes_S S, A) \cong \mathrm{Hom}_R({}^fU, A)$.


Now $\alpha : A \to A \otimes_R S$ is just the map $a \mapsto a \otimes 1$ and $\beta : \mathrm{Hom}_R(S, A) \to A$ is the map
$\varphi \mapsto 1\varphi$.

If $A = {}^fV$, we put U = V in (2.3.7), (2.3.8) and consider the image of the identity
map on the right of (2.3.7), (2.3.8); this provides maps $A_f \to V$, $V \to A^f$ which
together with $\alpha$, $\beta$ respectively define a splitting of $A_f$, $A^f$ respectively. ∎


The module $A_f$ is called the induced and $A^f$ the coinduced extension of A along f.
Here the variance refers to S, not A; in fact, both are covariant in A. They are sometimes
called relatively projective and relatively injective, on account of the following
property:


Corollary 2.3.6. If A is projective as R-module, then $A_f$ is projective as S-module; if A is
injective as R-module, then $A^f$ is injective as S-module.


Proof. The first part follows because $\mathrm{Hom}_S(A_f, -)$ is exact whenever $\mathrm{Hom}_R(A, -)$ is
exact, by (2.3.7); similarly the second part follows by (2.3.8). ∎
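
A concrete instance may clarify the two constructions. Assume $f : \mathbf{Z} \to \mathbf{Z}/n$ is the natural homomorphism ($n \ge 2$) and $A$ is an abelian group; the identifications below are standard, but the choice of example is ours rather than the text's.

```latex
% Induced extension: A_f = A \otimes_{\mathbf{Z}} \mathbf{Z}/n.
% Tensoring \mathbf{Z} \xrightarrow{n} \mathbf{Z} \to \mathbf{Z}/n \to 0 with A (right exactness):
\[
  A \xrightarrow{\;n\;} A \to A \otimes_{\mathbf{Z}} \mathbf{Z}/n \to 0
  \quad\Longrightarrow\quad A_f \cong A/nA .
\]
% Coinduced extension: A^f = \mathrm{Hom}_{\mathbf{Z}}(\mathbf{Z}/n, A).
% A homomorphism \varphi is determined by a = 1\varphi, which must satisfy na = 0:
\[
  A^f \cong \{\, a \in A \mid na = 0 \,\}.
\]
% Check against Corollary 2.3.6:
%   A = \mathbf{Z} (projective):            A_f \cong \mathbf{Z}/n, free over \mathbf{Z}/n.
%   A = \mathbf{Q}/\mathbf{Z} (injective):  A^f \cong \tfrac{1}{n}\mathbf{Z}/\mathbf{Z} \cong \mathbf{Z}/n,
%                                           so \mathbf{Z}/n is injective as a module over itself.
```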


An abelian category is said to possess enough projectives if every object can be written
as a quotient of a projective object. For example, the category $\mathrm{Mod}_R$ of right
modules over any ring R has enough projectives, because every module is a homomorphic
image of a free (hence projective) module, by BA, Theorem 4.6.3. Dually, an
abelian category is said to have enough injectives if every object can be embedded as a
subobject of an injective object. For example, Z as Z-module is contained in Q which
is injective. Let us show that $\mathrm{Mod}_R$ has enough injectives.


Proposition 2.3.7. Let R be any ring. Then $\mathrm{Mod}_R$ (as well as ${}_R\mathrm{Mod}$) has enough injectives,
i.e. every R-module can be embedded in an injective R-module.


Proof. We first take the special case R = Z. Every abelian group A can be written as a
quotient of a free abelian group: $A \cong F/N$. Now F is a direct sum of copies of Z and
by embedding Z in Q we can embed the abelian group F in a vector space over Q, G
say. Clearly G is divisible as Z-module and hence so is G/N, and it contains $F/N \cong A$
as a submodule. But for a Z-module ‘divisible’ is the same as ‘injective’ (by BA,
Proposition 4.7.8), so the Z-module A has been embedded in an injective Z-module.


Consider now the general case. Given any ring R, there is a natural homomorphism
$f : \mathbf{Z} \to R$, obtained by mapping $n \mapsto n.1$, and we can consider any R-module
M as Z-module by pullback along f. By what has been proved, M can be
embedded in an injective Z-module I, hence the coinduced extension
$M^f = \mathrm{Hom}_{\mathbf{Z}}(R, M)$ is a submodule of $I^f$, by the left exactness of Hom. By Proposition
2.3.5, M is a direct summand of $M^f$, hence it is an R-submodule of $I^f$, and $I^f$ is
injective as R-module by Corollary 2.3.6. ∎


It is possible to go beyond Proposition 2.3.7 and describe the ‘least’ injective 
module containing a given module: 


Theorem 2.3.8. Let R be any ring. Given R-modules M, E, where M C E, the following 
conditions are equivalent: 


(a) E is a maximal essential extension of M,
(b) E is a minimal injective module containing M.


Such an extension E exists for any R-module M, and if $E'$ is another extension of M
satisfying (a) and (b), then there is an isomorphism from E to $E'$ leaving M elementwise
fixed.


Proof. (a) ⇒ (b). If E is a maximal essential extension of M, then any essential extension
F of E is an essential extension of M, for any non-zero submodule of F meets E
and hence M non-trivially. By maximality we have F = E, so E has no proper essential
extensions and is therefore injective, by Proposition 2.3.3, but any submodule
of E containing M has E as essential extension and so cannot be injective unless it
is the whole of E, again by Proposition 2.3.3. Hence E is a minimal injective
module containing M.

(b) ⇒ (a). Assume that E is a minimal injective module containing M and let F
be any essential extension of M; we claim that F can be embedded in E. For the
inclusion of M in E extends to a homomorphism $f : F \to E$ because E is injective,
and $M \cap \ker f = 0$, hence $\ker f = 0$, because F is an essential extension of M.
Thus F is embedded in E. If F is a maximal essential extension of M, then as we
have just seen, we can take F to be a submodule of E and by the first part of the
proof F is injective. It follows that F is a direct summand of E, and so, by the minimality
of E, we have E = F, as we had to show.

We can always construct such an E by taking an injective module I containing
M (Proposition 2.3.7) and inside I taking a maximal essential extension of M,
using Zorn’s lemma. Finally, if E, $E'$ are two modules both satisfying (a) and (b),
then the identity mapping on M extends to a homomorphism $\alpha : E \to E'$ by the
injectivity of $E'$. The kernel of $\alpha$ meets M in 0, hence $\ker\alpha = 0$ by (a), so $\operatorname{im}\alpha$ is
an injective submodule of $E'$ and hence $\operatorname{im}\alpha = E'$ by (b). This shows $\alpha$ to be an
isomorphism. ∎


The module E in Theorem 2.3.8, first constructed by Eckmann and Schopf in 
1953, is called the injective hull of M. Although E is determined up to isomorphism 
by M, this isomorphism is not unique and the correspondence of E to M is not a
functor. Later, in Chapter 4, we shall find that over certain rings there is a dual
notion of projective cover for every finitely generated module. 
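
A standard example over $R = \mathbf{Z}$, added here as a sketch: it uses only Proposition 2.3.3, Theorem 2.3.8 and the fact, quoted in the proof of Proposition 2.3.7, that divisible abelian groups are injective.

```latex
% Claim: the injective hull of \mathbf{Z} (as \mathbf{Z}-module) is \mathbf{Q}.
% \mathbf{Q} is divisible, hence injective. The extension \mathbf{Z} \subseteq \mathbf{Q} is essential:
% if 0 \ne a/b \in N for a submodule N \subseteq \mathbf{Q} (a, b \in \mathbf{Z}, b \ne 0), then
\[
  b \cdot \frac{a}{b} = a \in N \cap \mathbf{Z}, \qquad a \ne 0 .
\]
% Any essential extension F of \mathbf{Z} containing \mathbf{Q} is also essential over \mathbf{Q}
% (a non-zero submodule of F meets \mathbf{Z} \subseteq \mathbf{Q}), and an injective module has no
% proper essential extension (Proposition 2.3.3), so F = \mathbf{Q}.
% Thus \mathbf{Q} is a maximal essential extension of \mathbf{Z}: condition (a) of Theorem 2.3.8.
```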


Exercises 


1. For any ring R show that the finitely generated right R-modules and homomorphisms
form a full subcategory of $\mathrm{Mod}_R$. Is it abelian? What about the
full subcategory of cyclic right R-modules?

2. Prove the rules (2.3.4)–(2.3.6) in detail.

3. Let $(A_i)$ be a family of objects in an abelian category. Show that if $\prod A_i$ exists
then it is injective iff each $A_i$ is injective; likewise, if $\coprod A_i$ exists, then it is
projective iff each $A_i$ is projective. (Warning. Exact sequences need not be
preserved under products or coproducts in general abelian categories.)

4. An object P in a category $\mathcal{A}$ is called a generator of $\mathcal{A}$ if $h^P = \mathcal{A}(P, -)$ is faithful.
Show that a generator in $\mathrm{Mod}_R$ is faithful as R-module, i.e. any non-zero
element of R defines a non-zero action.

5. Show that in an abelian category with arbitrary coproducts an object P is a
generator iff every object is a quotient of a copower of P. Deduce that an abelian
category with arbitrary coproducts and a projective generator has enough
projectives.

6. Dualize Exercises 4 and 5 to show that an abelian category with arbitrary
products and an injective cogenerator (i.e. $h_I = \mathcal{A}(-, I)$ is exact and faithful)
has enough injectives.

7. Show that R is a generator of $\mathrm{Mod}_R$. More generally, show that M is a generator
of $\mathrm{Mod}_R$ iff R is a direct summand of $M^n$ for some $n \ge 1$.

8. Let $f : R \to S$ be a ring homomorphism. Show that for any modules $U_R$, ${}_SV$ we
have $U_f \otimes_S V \cong U \otimes_R {}^fV$.

9. Show that for $f : R \to S$ as in Exercise 8 and modules $U_R$, ${}_RV$ we have
${}^f(U_f) \otimes_R V \cong U \otimes_R {}^f(V_f)$. Show further that when R, S are commutative,
then $U_f \otimes_S V_f \cong (U \otimes_R V)_f$.

10. Show that if M is a finitely generated module over a Noetherian ring R, then
$M^* = \mathrm{Hom}_R(M, R)$ is again finitely generated.

11. Show that in Baer’s criterion (Theorem 2.3.4) it is enough to test all large left 
ideals. 


2.4 Homological dimension 


For any abelian group A we can write down a presentation A & F/K, where F is a 
free abelian group and K as subgroup of F is also free. For modules over a ring 
we still have such a presentation, but now K need no longer be free, nor even pro- 
jective. If we take the view that projective modules are particularly simple (justified 
in the sequel), we may next take a presentation of K and hope that the process
terminates, i.e. that K is in some sense closer to being projective than A. Our first
objective is to assign a numerical value to this lack of projectivity. 

Given any R-module A, we take a projective module $P_0$ mapping onto A. The
kernel $K_0$ need not be projective, but we can again take a projective $P_1$ mapping
onto $K_0$. This map has kernel $K_1$ and we can continue the process, giving rise to a
commutative diagram with exact row as follows:


[Diagram: the exact row $\cdots \to P_2 \to P_1 \to P_0 \to A \to 0$, in which each map $P_{i+1} \to P_i$ factors through the kernel $K_i$ as $P_{i+1} \to K_i \to P_i$.]

As a rule one omits the kernels and just writes the exact sequence

$\cdots \to P_n \to P_{n-1} \to \cdots \to P_1 \to P_0 \to A \to 0$.   (2.4.1)


This is called a projective resolution of A. For example, let $R = k[x, y]$ be the polynomial
ring in x, y over a field k and consider k as R-module, by pullback along
the natural homomorphism $R \to R/(x, y) = k$. We resolve k by mapping R to
$R/(x, y)$; the kernel is the ideal $(x, y)$ and we next take the map $R^2 \to (x, y)$ defined
by $(a, b) \mapsto ax - by$. Its kernel is the set of all $(cy, cx)$, $c \in R$, and this is isomorphic to R.
Thus we have obtained a resolution


$0 \to R \to R^2 \to R \to k \to 0$.   (2.4.2)
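
The exactness of (2.4.2) can be checked directly. The sketch below is our own verification, assuming the maps are $c \mapsto (cy, cx)$ and $(a, b) \mapsto ax - by$ as described above, and using unique factorization in $k[x, y]$; the names $d_1$, $d_2$, $\varepsilon$ are introduced here for convenience.

```latex
% Maps (written on the left): d_2 : R \to R^2, c \mapsto (cy, cx);
%                              d_1 : R^2 \to R, (a, b) \mapsto ax - by;
%                              \varepsilon : R \to k, reduction modulo (x, y).
\[
  d_1 d_2(c) = (cy)x - (cx)y = 0, \qquad \varepsilon d_1(a, b) = \varepsilon(ax - by) = 0 .
\]
% Exactness at R: \ker\varepsilon = (x, y) = \{ax - by\} = \operatorname{im} d_1.
% Exactness at R^2: if ax = by, then x divides by, and since x is prime and does not
% divide y we get b = cx; cancelling x gives a = cy, so (a, b) = d_2(c).
% Exactness on the left: (cy, cx) = 0 forces c = 0, since R is an integral domain.
\[
  0 \to R \xrightarrow{\,d_2\,} R^2 \xrightarrow{\,d_1\,} R \xrightarrow{\,\varepsilon\,} k \to 0 .
\]
```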


As a second example, take R = Z/4, A = Z/2. Clearly A is not projective, but we
have a homomorphism $R \to A$ consisting of multiplication by 2, with kernel
$2\mathbf{Z}/4 \cong A$, hence we obtain an infinite resolution


$\cdots \to R \to R \to A \to 0$.   (2.4.3)


It is clear that a projective resolution, possibly infinite, exists for any module, because
$\mathrm{Mod}_R$ has enough projectives, and the resolution terminates when we reach a projective
kernel. In order to compare different resolutions of a given module we
need Schanuel’s lemma. It is useful to have this in an extended form.


Proposition 2.4.1. Let R be any ring and M an R-module. Given two short exact
sequences $0 \to A \to P \to M \to 0$ and $0 \to B \to Q \to M \to 0$, where P is projective,
we have an exact sequence


$0 \to A \to P \oplus B \to Q \to 0$.   (2.4.4)


Proof. If we form the pullback of $P \to M$, $Q \to M$ and recall Proposition 2.1.6 and
Corollary 2.1.7, we obtain an exact commutative diagram, where $A' \cong A$, $B' \cong B$:


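$$\begin{array}{ccccccccc}
 & & & & 0 & & 0 & & \\
 & & & & \downarrow & & \downarrow & & \\
 & & & & A' & \to & A & & \\
 & & & & \downarrow & & \downarrow & & \\
0 & \to & B' & \to & C & \to & P & \to & 0 \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
0 & \to & B & \to & Q & \to & M & \to & 0 \\
 & & & & \downarrow & & \downarrow & & \\
 & & & & 0 & & 0 & &
\end{array}$$


Since P is projective, the middle horizontal sequence splits and $C \cong P \oplus B' \cong P \oplus B$.
Now the middle vertical (with $A'$ replaced by its isomorph A) is the desired exact
sequence (2.4.4). ∎


If Q as well as P is projective, (2.4.4) splits and we obtain


Lemma 2.4.2 (Schanuel’s lemma). Given two short exact sequences as in Proposition
2.4.1, if P and Q are projective, then $P \oplus B \cong Q \oplus A$. ∎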
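
A small check of Schanuel's lemma over Z, added as an illustration: two different presentations of $M = \mathbf{Z}/2$ by free modules, with the kernels compared as the lemma predicts. The second presentation is our own choice.

```latex
% First presentation:  0 \to A \to P \to M \to 0 with P = \mathbf{Z}, A = 2\mathbf{Z} \cong \mathbf{Z}.
% Second presentation: 0 \to B \to Q \to M \to 0 with Q = \mathbf{Z}^2, (a, b) \mapsto a + b \bmod 2,
\[
  B = \{(a, b) \in \mathbf{Z}^2 \mid a + b \text{ even}\}
    = \mathbf{Z}(1, 1) \oplus \mathbf{Z}(0, 2) \cong \mathbf{Z}^2 ,
\]
% since (a, b) = a(1, 1) + \tfrac{b - a}{2}(0, 2) whenever a + b is even.
% Schanuel's lemma predicts P \oplus B \cong Q \oplus A:
\[
  P \oplus B \cong \mathbf{Z} \oplus \mathbf{Z}^2 = \mathbf{Z}^3, \qquad
  Q \oplus A \cong \mathbf{Z}^2 \oplus \mathbf{Z} = \mathbf{Z}^3 .
\]
% The kernels A and B are not isomorphic (ranks 1 and 2), but they become isomorphic
% after adding suitable projective summands.
```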


This result suggests the following definition. Two modules M, N are called projectively
equivalent if there exist projectives P, Q such that


$P \oplus M \cong Q \oplus N$.


It is clear that this is in fact an equivalence relation; we shall denote the class of M by
[M] and note that [M] = 0 iff M is projective.

On the set of all equivalence classes we can define an operation as follows. Given
M, we resolve it by a projective:


$0 \to A \to P \to M \to 0$,


and write $\pi(M) = [A]$. By Schanuel’s lemma the class [A] depends only on M, not
on the resolution chosen. If we replace M by $M \oplus Q$, where Q is projective, we have a
resolution


$0 \to A \to P \oplus Q \to M \oplus Q \to 0$,


and this shows that $\pi(M)$ depends in fact only on the class [M] and not on M itself;
$\pi$ is sometimes called the loop functor.
We can now define the homological (or projective) dimension of a module M as


$\mathrm{hd}(M) = \min\{n \mid \pi^n(M) = 0\}$, where $\pi^0(M) = [M]$.


This depends only on the class of M; the definition shows that $\mathrm{hd}(M) \le n$ iff M has
a projective resolution of length $\le n$, i.e. of the form (2.4.1) with $P_i = 0$ for $i > n$.


The global dimension of a ring R is defined as

$\mathrm{gl.dim.}(R) = \sup\{\mathrm{hd}(M) \mid \text{all } M_R\}$.


If necessary, we shall distinguish the right and left global dimensions, formed from 
right or left modules. In general these two numbers may be distinct, although they 
coincide for Noetherian rings (see Section 2.6 below). The rings of global dimension 
0 are just the semisimple rings, for they are the rings for which every module is 
projective (BA, Theorem 5.2.7). As an example of a ring of infinite global dimension 
we have the ring Z/4, as the resolution (2.4.3) shows. 
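
The claim about Z/4 can be made explicit with the loop functor. The following sketch is an added illustration; it assumes only the definitions above, Theorem 2.2.8 and the resolution (2.4.3), and writes $\pi$ for the loop functor.

```latex
% R = \mathbf{Z}/4, A = \mathbf{Z}/2. One step of the resolution:
\[
  0 \to 2\mathbf{Z}/4 \to \mathbf{Z}/4 \to \mathbf{Z}/2 \to 0, \qquad 2\mathbf{Z}/4 \cong \mathbf{Z}/2 ,
\]
% so \pi(A) = [A], and by induction \pi^n(A) = [A] for all n \ge 0.
% A is not projective: otherwise the sequence would split (Theorem 2.2.8), giving
% \mathbf{Z}/4 \cong \mathbf{Z}/2 \oplus \mathbf{Z}/2, which is false since \mathbf{Z}/4 has an element of order 4.
\[
  \pi^n(\mathbf{Z}/2) = [\mathbf{Z}/2] \ne 0 \ \text{for all } n
  \quad\Longrightarrow\quad \mathrm{hd}(\mathbf{Z}/2) = \infty, \quad
  \mathrm{gl.dim.}(\mathbf{Z}/4) = \infty .
\]
% By contrast, over \mathbf{Z} the same module has the resolution
% 0 \to \mathbf{Z} \xrightarrow{2} \mathbf{Z} \to \mathbf{Z}/2 \to 0 with free kernel, so \pi(\mathbf{Z}/2) = 0 there.
```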

There is an analogous development using injective resolutions, based on the fact
that $\mathrm{Mod}_R$ also has enough injectives (Proposition 2.3.7). Given an R-module M,
we form an injective resolution


$0 \to M \to I_0 \to I_1 \to I_2 \to \cdots$   (2.4.5)


by embedding M in an injective module $I_0$, then embedding the cokernel in an
injective $I_1$ and so on. The dual of Schanuel’s lemma shows that in a short resolution
$0 \to M \to I \to L \to 0$, the ‘injective class’ of L (defined like the projective equivalence
class) depends only on that of M, and so may be written $\iota(M)$. The least integer
n such that $\iota^n(M) = 0$ is called the cohomological or injective dimension of M,
written cd(M). As before we can define the corresponding global dimension of R,
but as we shall see in Section 2.6 below, this agrees with the global dimension defined
in terms of homological dimension. In the special case of global dimension 0 this is
already clear from the definition of a semisimple ring as a ring over which any short
exact sequence splits.

To illustrate these ideas we shall examine the class of rings of global dimension 1. 
We shall need a lemma which may be regarded as the dual of Baer’s injectivity 
criterion (Theorem 2.3.4). 


Lemma 2.4.3. An R-module P is projective if and only if every homomorphism from P 
to a quotient of an injective module I can be lifted to I itself. 


Proof. The condition is necessary by Theorem 2.2.8. Conversely, assume that it
holds. Given a short exact sequence 0 → A → B → P → 0, we embed B in an injective
module I and form the pushout:


[Commutative diagram: the exact row 0 → A → B → P → 0 above the exact row with middle term I obtained by forming the pushout along the embedding B → I.]


Since the embedding B → I is monic, the pushout is a pullback, by the dual of the argument following
Proposition 2.1.6. Now by hypothesis the map from P to the pushout, which is a quotient of I, can be
lifted to a map θ: P → I; therefore by the pullback property there is a map λ: P → B whose
composite with B → P is the identity on P. Thus the given sequence splits and this shows P to be projective. ∎




Theorem 2.4.4. For any ring R the following conditions are equivalent:


(a) every quotient of an injective right R-module is injective,
(b) every submodule of a projective right R-module is projective,
(c) every right ideal of R is projective. 


Proof. (a) ⇒ (b). Given the diagram

\[
\begin{array}{ccccc}
P & \longleftarrow & P' & \longleftarrow & 0 \\
 & & \downarrow & & \\
I & \longrightarrow & I'' & \longrightarrow & 0
\end{array}
\]

where P is projective, we have to fill in the map P′ → I to produce a commutative
triangle. By Lemma 2.4.3 we may assume that I is injective, and then I″ is injective,
by (a). We can therefore fill in P → I″, and then P → I (because P is projective).
Now the composition P′ → P → I provides the required map.

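(b) ⇒ (c) is trivial and the proof of (c) ⇒ (a) is dual to the first part, using the
diagram below.

\[
\begin{array}{ccccc}
0 & \longrightarrow & \mathfrak{a} & \longrightarrow & R \\
 & & \downarrow & & \\
I & \longrightarrow & I'' & \longrightarrow & 0
\end{array}
\]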


A ring satisfying the conditions of this theorem is said to be right hereditary. For 
example, any principal ideal domain (commutative or not) is right (and also left) 
hereditary. From (b) it is clear that the right hereditary rings are just the rings of 
global dimension at most 1, and (a) shows that the same class is obtained by com- 
puting the global dimension from injective resolutions. 

In the commutative case hereditary integral domains have another more illumi- 
nating description. We recall from BA, Section 10.5 that an ideal a in a commutative 
integral domain R is called invertible if there is an R-submodule b of its field of frac- 
tions K such that ab = R. In BA, Proposition 10.5.1 we saw that an ideal is invertible 
iff it is non-zero projective; in particular such an ideal must be finitely generated. In
fact a commutative hereditary domain is precisely a Dedekind domain, as the 
description of the latter in BA, Section 10.5 shows. 


Exercises 


1. Given two projective resolutions (P_i), (P′_i) of finite length of a module M, show
that P_0 ⊕ P′_1 ⊕ P_2 ⊕ ... ≅ P′_0 ⊕ P_1 ⊕ P′_2 ⊕ ... (extended Schanuel lemma).

2. Find the global dimension of Z/n. (Hint. Take first the case of a prime power.)

3. Let R be a right hereditary Noetherian ring and M a finitely generated submodule
of R^I, as right R-module, for some set I. Show that M is a direct sum
of modules isomorphic to right ideals of R, and hence is projective. (Hint. 
Among the direct summands of M isomorphic to a direct sum of right ideals, 
pick a maximal one.) 




4. Show that if R is as in Exercise 3 and M a finitely generated right R-module,
then M = M_0 ⊕ a_1 ⊕ ... ⊕ a_r, where the a_i are right ideals of R and
M_0 = ⋂{ker f | f : M → R_R}.

5. Let R be a ring such that every right ideal is free as right R-module. Show that 
every submodule of a free right R-module is free (a ring with IBN in which every 
right ideal is free is called a right fir, see Section 8.7 below). Show that if R is also 
commutative, it must be a principal ideal domain. 

6. Show that a left fir (defined by symmetry as in Exercise 5) which is right 
Noetherian is also a right fir, in fact right principal (see Section 8.7 below). 

7. Show that if R and S are right hereditary rings and U is any (R, S)-bimodule,
projective as right S-module, then the triangular matrix ring

\[ \begin{pmatrix} R & U \\ 0 & S \end{pmatrix} \]

is again right hereditary.

8. Show that Q is not projective, as Z-module. Deduce that the triangular matrix
ring

\[ \begin{pmatrix} \mathbf{Z} & \mathbf{Q} \\ 0 & \mathbf{Q} \end{pmatrix} \]

is right but not left hereditary.


9. Show that over a commutative integral domain R, every quotient of a divisible 
module is divisible. Deduce that R is a Dedekind domain iff every divisible 
R-module is injective. 

10. In the ring ℝ[sin θ, cos θ] show that the ideal generated by sin θ and 1 − cos θ is
projective but not principal, and hence not a direct summand (this ring is a
Dedekind domain, see BA, Section 10.5).


2.5 Derived functors 


We have seen that the hom functor and tensor product are only left and right exact 
respectively, and we shall now describe a way of measuring this lack of exactness. The 
method is quite general and applies to any functor which is left or right exact. The 
basic idea of the construction is as follows. Let F be a functor which is right exact 
covariant, say. Given any module A, we take a projective resolution 


\[ \cdots \to X_n \to X_{n-1} \to \cdots \to X_0 \to A \to 0 \]

and apply F:

\[ \cdots \to FX_n \to \cdots \to FX_1 \to FX_0 \to FA \to 0. \]


In general this will no longer be exact, but it still is a complex. From any such complex
one can form homology groups H_n(A) described below, which measure the lack
of exactness of F; taking the X_n projective ensures that these groups depend only on
A and F, but not on the choice of the resolution X.

Before we enter on the actual construction, we need some properties of commu- 
tative diagrams. These are true in any abelian category, but we shall only consider the 
case of modules, where they can be verified by diagram-chasing. As a matter of fact, 




most of the results then follow for general abelian categories, because every small 

abelian category has an exact embedding into a module category (see Mitchell 

(1965) and Further Exercise 12 of Chapter 4), but we shall not make use of this fact. 
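Given any commutative square I:

\[
\begin{array}{ccc}
\cdot & \xrightarrow{\alpha} & \cdot \\
{\scriptstyle\beta}\downarrow & & \downarrow{\scriptstyle\gamma} \\
\cdot & \xrightarrow{\delta} & \cdot
\end{array}
\]

with αγ = βδ, we define the image ratio of I as i(I) = (im γ ∩ im δ)/im(βδ) and the kernel ratio of I
as k(I) = ker(βδ)/(ker α + ker β). We note that if α or β is epic, then i(I) = 0,
and if γ or δ is monic, then k(I) = 0. The ratios of two adjacent squares are related by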


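Lemma 2.5.1 (Two-square lemma). Given a commutative diagram with exact rows:

\[
\begin{array}{ccccc}
\cdot & \xrightarrow{\lambda} & \cdot & \xrightarrow{\mu} & \cdot \\
{\scriptstyle\alpha}\downarrow & \mathrm{I} & \downarrow{\scriptstyle\beta} & \mathrm{II} & \downarrow{\scriptstyle\gamma} \\
\cdot & \xrightarrow{\lambda'} & \cdot & \xrightarrow{\mu'} & \cdot
\end{array}
\]

we have i(I) ≅ k(II).

Proof. We must show that

\[ \frac{\mathrm{im}\,\beta \cap \mathrm{im}\,\lambda'}{\mathrm{im}\,\lambda\beta} \cong \frac{\ker(\beta\mu')}{\ker\beta + \ker\mu}. \]

Clearly im β ∩ im λ′ = im β ∩ ker μ′ = {xβ | xβμ′ = 0} = (ker βμ′)β, and im λβ =
(im λ)β = (ker μ)β = (ker β + ker μ)β. Now both ker βμ′ and ker β + ker μ
contain ker β, hence by the third isomorphism theorem,

\[ k(\mathrm{II}) = \frac{\ker\beta\mu'}{\ker\beta + \ker\mu} \cong \frac{(\ker\beta\mu')\beta}{(\ker\beta + \ker\mu)\beta} = i(\mathrm{I}). \]

∎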


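Lemma 2.5.2. Given the commutative diagram with exact rows,

\[
\begin{array}{ccccccccc}
 & & A & \xrightarrow{\lambda} & B & \xrightarrow{\mu} & C & \to & 0 \\
 & & \downarrow{\scriptstyle\alpha} & & \downarrow{\scriptstyle\beta} & & \downarrow{\scriptstyle\gamma} & & \\
0 & \to & A' & \xrightarrow{\lambda'} & B' & \xrightarrow{\mu'} & C' & &
\end{array}
\]

the following diagram is commutative with exact rows and columns:

\[
\begin{array}{ccccccccc}
 & & \ker\alpha & \xrightarrow{\lambda^{*}} & \ker\beta & \xrightarrow{\mu^{*}} & \ker\gamma & & \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
 & & A & \xrightarrow{\lambda} & B & \xrightarrow{\mu} & C & \to & 0 \\
 & & \downarrow{\scriptstyle\alpha} & & \downarrow{\scriptstyle\beta} & & \downarrow{\scriptstyle\gamma} & & \\
0 & \to & A' & \xrightarrow{\lambda'} & B' & \xrightarrow{\mu'} & C' & & \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
 & & \mathrm{coker}\,\alpha & \xrightarrow{\lambda'_{*}} & \mathrm{coker}\,\beta & \xrightarrow{\mu'_{*}} & \mathrm{coker}\,\gamma & &
\end{array}
\]

Here λ*, μ* are the maps induced between the kernels and λ′_*, μ′_* are the maps induced
between the cokernels. Moreover, if λ is monic, then so is λ* and if μ′ is epic, so is μ′_*.
The proof, by diagram chasing, is straightforward and may be left to the reader. ∎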


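Lemma 2.5.3 (Snake lemma). Given the diagram in the hypothesis of Lemma 2.5.2,
there exists a homomorphism Δ : ker γ → coker α such that the sequence

\[ \ker\alpha \xrightarrow{\lambda^{*}} \ker\beta \xrightarrow{\mu^{*}} \ker\gamma \xrightarrow{\Delta} \mathrm{coker}\,\alpha \xrightarrow{\lambda'_{*}} \mathrm{coker}\,\beta \xrightarrow{\mu'_{*}} \mathrm{coker}\,\gamma \]

is exact.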


Proof. (J. Lambek) We have to prove exactness at ker γ and at coker α, for a suitable
Δ, and for this it is enough to show that coker μ* ≅ ker λ′_*. Writing X = coker μ*,
Y = ker λ′_*, we have the following commutative diagram with exact rows and
columns:

[Commutative diagram: the diagram of Lemma 2.5.2 extended by X and Y, with the rows ker γ → X → 0 and 0 → Y → coker α → coker β; its squares are numbered 1 to 8.]

By the 2-square lemma, X ≅ i(1) ≅ k(2) ≅ i(3) ≅ k(4) ≅ i(5) ≅ k(6) ≅ i(7) ≅
k(8) ≅ Y. ∎


We can now return to the task of constructing the homology of a complex. It 
will be convenient to treat a special case first, that of a differential module. By a 
differential module X we understand an R-module X with an endomorphism d of 
square zero: d² = 0. If we regard these modules as objects of a category Diff_R, the
maps in the category are taken to be homomorphisms preserving the structure,
i.e. the homomorphisms f : X → Y such that the square shown commutes:

\[
\begin{array}{ccc}
X & \xrightarrow{f} & Y \\
{\scriptstyle d}\downarrow & & \downarrow{\scriptstyle d} \\
X & \xrightarrow{f} & Y
\end{array}
\]

These maps are traditionally called chain-maps. The condition d² = 0 means that
im d ⊆ ker d. We shall write im d = B, ker d = C and call the elements of B boundaries
and those of C cycles. Finally H = C/B is called the homology group of X. (For an
excellent concise indication of the geometrical background, see Mac Lane (1963).) 


In general B and C and hence H = H(X) are merely abelian groups, but when R is 
a K-algebra, they are K-modules. We have 


Theorem 2.5.4. For any K-algebra R, H : Diff_R → Mod_K is a covariant K-linear functor
from differential modules to K-modules.
The proof is a straightforward verification, which may be left to the reader. 


We observe that a complex may be regarded as a special case of a differential 
module. To see this we need only replace modules by graded modules (regarding 
R as a graded ring concentrated in degree 0). The type of complex we encounter 
will generally be a graded module with an antiderivation, also called a differential, 
of degree r = 1 or −1. In the case r = −1 one speaks of a chain complex, in the
case r = 1 of a cochain complex. Thus given a chain complex

\[ \cdots \to X_n \xrightarrow{d_n} X_{n-1} \to \cdots \to X_1 \xrightarrow{d_1} X_0 \to 0, \tag{2.5.1} \]

we can regard this as a graded differential module and the homology group is again a
graded module; in detail we have

\[ H_n(X) = \ker d_n / \mathrm{im}\, d_{n+1} \quad (n = 0, 1, \ldots). \]

Here d_0 is taken to be the zero map, thus H_0(X) = X_0/im d_1. It is clear that the
complex given by (2.5.1) is exact precisely when H(X) = 0, so that H(X) may be
taken as a measure of the lack of exactness of X. We note further that if in (2.5.1)
each X_n for n ≥ 0 is projective, then (2.5.1) is a projective resolution of the
R-module A iff

\[ H_n(X) = \begin{cases} A & \text{for } n = 0, \\ 0 & \text{for } n \neq 0. \end{cases} \tag{2.5.2} \]




A complex (X_n) satisfying (2.5.2) is said to be acyclic over A.
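
To make the formula for the homology of a chain complex concrete, the following sketch computes the dimensions of the homology groups of a finite chain complex of vector spaces. It is only an illustration under a simplifying assumption: the ground ring is the field Q, so every module is free and dim H_n = dim ker d_n − rank d_{n+1}. The function names `rank` and `homology_dims` and the two triangle examples are hypothetical and not taken from the text; matrices act on row vectors, matching the right-hand notation used for maps.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a rational matrix (given as a list of rows) by Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    ncols = len(rows[0]) if rows else 0
    rk = 0
    for col in range(ncols):
        # look for a pivot in this column at or below row rk
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            factor = rows[r][col] / rows[rk][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk


def homology_dims(dims, boundaries):
    """dims[n] = dim X_n; boundaries[n] = matrix of d_n : X_n -> X_{n-1} (n >= 1),
    acting on row vectors, so it has dims[n] rows and dims[n-1] columns.
    Returns [dim H_0, dim H_1, ...] using dim H_n = dim ker d_n - rank d_{n+1}."""
    top = len(dims) - 1
    ranks = [0] * (top + 2)  # d_0 = 0 and d_{top+1} = 0
    for n in range(1, top + 1):
        ranks[n] = rank(boundaries[n])
    return [dims[n] - ranks[n] - ranks[n + 1] for n in range(top + 1)]


# Hollow triangle: 3 vertices, 3 edges; the edge (a, b) is sent to b - a.
d1 = [[-1, 1, 0], [0, -1, 1], [-1, 0, 1]]
circle = homology_dims([3, 3], {1: d1})

# Filled triangle: one extra 2-cell with boundary e01 + e12 - e02, so d2 d1 = 0.
d2 = [[1, 1, -1]]
disc = homology_dims([3, 3, 1], {1: d1, 2: d2})

print(circle, disc)
```

In the second example the homology is one-dimensional in degree 0 and vanishes elsewhere, so that complex is acyclic over Q in the sense just defined, whereas the first has non-zero homology in degree 1.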

We shall state the next few results for differential modules rather than complexes
(= graded differential modules), for the sake of simplicity. The change to complexes 
can easily be made by the reader. 

By Theorem 2.5.4 an isomorphism of differential modules induces an isomorph- 
ism of homology groups, but since structure is lost in passing to homology, one 
would expect a much wider class of mappings to induce isomorphisms. The appro- 
priate notion is suggested by the topological background. 


Definition. Two chain maps f, g : X → Y of differential modules are said to be
homotopic: f ≃ g, if there is an R-homomorphism s : X → Y, called a homotopy,
such that


\[ s\,d_Y + d_X\,s = f - g. \tag{2.5.3} \]


It is clear that this relation between chain maps is an equivalence; moreover it has 
the desired property: 


Proposition 2.5.5. Homotopic chain maps induce the same homology map, i.e. if
f ≃ g, then H(f) = H(g).


Proof. By linearity it is enough to show that if f ≃ 0, then H(f) = 0. If c ∈ C(X),
then cd = 0 and since f = sd + ds, it follows that cf = csd + cds = csd ∈ B(Y).
Thus C(X)f ⊆ B(Y) and so H(f) = 0. ∎


It is clear how this definition and proposition have to be modified if X, Y are 
graded, say both are chain complexes. For f, g to be homotopic they must be of 
the same degree r say, and then the homotopy s will have degree r + 1. 

A chain map between differential modules, f : X → Y, is said to be a chain equivalence
or homotopy equivalence if there is a second chain map f′ : Y → X such that
ff′ ≃ 1_X, f′f ≃ 1_Y. This is easily seen to be an equivalence relation between chain
maps and now Proposition 2.5.5 yields


Corollary 2.5.6. If f : X → Y is a chain equivalence, then H(f) is an isomorphism.


We now come to a basic property of short exact sequences of differential modules: 


Theorem 2.5.7. Given a short exact sequence of differential modules:

\[ \mathbf{X}: \quad 0 \to X' \to X \to X'' \to 0, \]

there exists a homomorphism Δ : H(X″) → H(X′) natural in X, such that the
triangle

\[
\begin{array}{ccc}
 & H(X) & \\
\nearrow & & \searrow \\
H(X') & \xleftarrow{\ \Delta\ } & H(X'')
\end{array}
\]

is exact.


Δ is known as the connecting homomorphism.


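Proof. The exact sequence X may be written

\[
\begin{array}{ccccccccc}
0 & \to & C' & \to & C & \to & C'' & & \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
0 & \to & X' & \to & X & \to & X'' & \to & 0 \\
 & & \downarrow{\scriptstyle d} & & \downarrow{\scriptstyle d} & & \downarrow{\scriptstyle d} & & \\
0 & \to & X' & \to & X & \to & X'' & \to & 0 \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
 & & X'/B' & \to & X/B & \to & X''/B'' & \to & 0
\end{array}
\]

where C and X/B are the kernel and cokernel of d respectively. By Lemma 2.5.2 the
whole diagram is commutative, with exact rows and columns. Now the map
d : X → X induces a map X/B → C, because Xd ⊆ C and Bd = 0, and this map
has both kernel and cokernel equal to H = C/B. Hence we have a commutative
diagram

\[
\begin{array}{ccccccccc}
 & & H' & \to & H & \to & H'' & & \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
 & & X'/B' & \to & X/B & \to & X''/B'' & \to & 0 \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
0 & \to & C' & \to & C & \to & C'' & & \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
 & & H' & \to & H & \to & H'' & &
\end{array}
\]

which has exact rows and columns, by Lemma 2.5.2. Further, by the snake lemma,
there is a homomorphism Δ : H″ → H′ which makes the homology triangle
exact, and which from its derivation is natural in X. ∎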


The important case of this theorem is that where X′, X, X″ are in fact graded
modules, usually chain or cochain complexes, with maps of degree zero between
them, and with d of degree −1 or 1 respectively. In the case of chain complexes,
say, the exact triangle takes on the form of an infinite sequence

\[ \cdots \to H_n(X') \to H_n(X) \to H_n(X'') \xrightarrow{\Delta} H_{n-1}(X') \to \cdots \to H_0(X'') \to 0, \]


which is called the exact homology sequence associated with the short exact 
sequence X. 

Any R-module may be trivially regarded as a chain complex concentrated in 
degree 0, i.e. M_0 = M, M_n = 0 for n ≠ 0, and d = 0. In the sequel all chain complexes
will be zero in negative dimension, i.e. X_n = 0 for n < 0. The complex X is
said to be over M if there is an exact sequence X → M → 0 (regarding M as a trivial
chain complex in the way described above). In full this sequence reads

\[ \cdots \to X_n \to X_{n-1} \to \cdots \to X_1 \to X_0 \xrightarrow{\varepsilon} M \to 0. \tag{2.5.4} \]

If this is exact, it is called a resolution, or also an acyclic complex; then H_n(X) = 0
for n > 0, H_0(X) = M. If each X_n is projective, we have a projective resolution. An
important property of projective resolutions is that they are universal among resolu- 
tions of M. This follows from the more general 


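Theorem 2.5.8 (Comparison theorem). Given two complexes

\[
\begin{array}{ccccc}
X & \xrightarrow{\varepsilon} & M & \to & 0 \\
 & & \downarrow{\scriptstyle\varphi} & & \\
X' & \xrightarrow{\varepsilon'} & M' & \to & 0
\end{array}
\]

where X is projective, X′ is a resolution of M′ and φ is a homomorphism, then there
exists a chain map f : X → X′ such that the resulting diagram commutes, and f is
unique up to homotopy.

We shall also say that f is over φ or that f lifts φ.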


Proof. We have to construct f_n : X_n → X′_n such that f_n d′ = d f_{n−1} (n ≥ 1) and
f_0 ε′ = εφ. We construct these maps recursively, using the fact that X_n is projective
and im(d f_{n−1}) ⊆ im d′ (by the exactness of X′). At the n-th stage (when f_{n−1} has
been constructed) we have the diagram

\[
\begin{array}{ccccc}
 & & X_n & & \\
 & & \downarrow{\scriptstyle d f_{n-1}} & & \\
X'_n & \xrightarrow{d'} & \mathrm{im}\, d' & \to & 0
\end{array}
\]

and this can be completed by a map f_n : X_n → X′_n, because X_n is projective; this still
applies for n = 0, taking ε, φ in place of d, f_{n−1}. To show that the map f so
constructed is unique up to homotopy, suppose that f + h is another map lifting φ; then h lifts 0
and we have to find s_n : X_n → X′_{n+1} such that

\[ s_n d' + d\,s_{n-1} = h_n \quad (n \geq 1), \qquad s_0 d' = h_0. \]

The construction needed is quite similar to the one just carried out and may be left
to the reader. ∎


In particular, if we have two projective resolutions of M, we can lift the identity on
M to a map of complexes which is unique up to homotopy, hence we obtain


Corollary 2.5.9. Any two projective resolutions of a module M are chain equivalent and
hence give rise to isomorphic homology groups, for any functor. ∎


We now have all the means at our disposal for constructing derived functors. 
Theorem 2.5.10. Let F be a covariant right exact functor on Mod_R. Then there exist
functors F_n (n = 0, 1, ...) such that

(i) F_0 M ≅ FM,

(ii) F_n P = 0 for n > 0 if P is projective,

(iii) to each short exact sequence of R-modules

\[ \mathbf{A}: \quad 0 \to A' \to A \to A'' \to 0, \]

there corresponds a long exact sequence

\[ \cdots \to F_n A' \to F_n A \to F_n A'' \xrightarrow{\Delta} F_{n-1} A' \to F_{n-1} A \to \cdots \to FA' \to FA \to FA'' \to 0, \tag{2.5.5} \]

where the connecting homomorphism Δ is natural in A. Moreover, F_n is determined up
to natural isomorphism by (i)-(iii).


Proof. We begin by proving the uniqueness. Given any short exact sequence

\[ 0 \to K \to P \to A \to 0, \tag{2.5.6} \]

where P is projective, we have for any F_n satisfying (i)-(iii), the exact sequence

\[ 0 \to F_1 A \to FK \to FP \to FA \to 0; \]

thus any exact sequence (2.5.6) with P projective determines F_1 A. If F_1 A is obtained
from (2.5.6) and F′_1 A from

\[ 0 \to K' \to P' \to A \to 0, \]

where P′ is again projective, we form the pullback Q of P → A and P′ → A and
apply F. We thus obtain the commutative diagram:

[Commutative diagram: the row 0 → FK → FQ → FP′ → 0 and the column 0 → FK′ → FQ → FP → 0 meet in FQ; the bottom row is 0 → F_1 A → FK → FP → FA → 0 and the right-hand column is 0 → F′_1 A → FK′ → FP′ → FA → 0; the squares are numbered 1 to 5.]

The row and column meeting in FQ are split exact, because they arose by applying
F to split exact rows and columns (split because P and P′ are projective). The
remaining rows and columns are also exact, and by the 2-square lemma we have
F′_1 A ≅ i(1) ≅ k(2) ≅ i(3) ≅ k(4) ≅ i(5) ≅ F_1 A. This shows that F_1 A is determined
up to isomorphism by (i)-(iii). For n > 1 the short exact sequence (2.5.6) yields
the long exact sequence

\[ \cdots \to F_n P \to F_n A \to F_{n-1} K \to F_{n-1} P \to \cdots \]

and F_n P = F_{n−1} P = 0 by (ii). Thus in terms of the loop operator π introduced in
Section 2.4 we have the formula

\[ F_n A \cong F_{n-1}(\pi A); \tag{2.5.7} \]

this makes sense since F_n is constant on projective equivalence classes for n > 0, by
(ii). Now the uniqueness follows by induction on n.

It remains to prove the existence of F_n; here we may use any projective resolution
for A, say X → A → 0. Applying F, we get a complex FX → FA → 0. In detail this
reads

\[ \cdots \to FX_n \to FX_{n-1} \to \cdots \to FX_1 \to FX_0 \to FA \to 0. \]

We put F_n A = H_n(FX) and assert that this satisfies (i)-(iii).


(i) We have X_1 → X_0 → A → 0 and by the right exactness of F obtain the exact
sequence

\[ FX_1 \to FX_0 \to FA \to 0; \]

hence FA ≅ FX_0/im FX_1 = H_0(FX).




(ii) If A is projective, we can take as our resolution

\[ 0 \to A \to A \to 0; \]

applying F, we find 0 → FA → FA → 0, hence F_n A = 0 for n > 0.

(iii) Given a short exact sequence A as in (iii), we take projective resolutions X′, X″
of A′ and A″ and construct a projective resolution X of A by induction on n.
If the kernels at the (n − 1)-th stage are K′_n, K_n, K″_n, we have

\[
\begin{array}{ccccccccc}
0 & \to & X'_n & \to & X_n & \to & X''_n & \to & 0 \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
0 & \to & K'_n & \to & K_n & \to & K''_n & \to & 0
\end{array}
\]

where X_n = X′_n ⊕ X″_n. Since X″_n is projective, we have a map X″_n → K_n, while
the map X′_n → K_n arises by composition (via K′_n). By definition of X_n as a
direct sum (i.e. coproduct) we obtain a map X_n → K_n to make the squares
commute, and a simple diagram chase shows that this map is epic (this also
follows from the 5-lemma). By induction on n we thus have a resolution X
of A. Now the row

\[ 0 \to X'_n \to X_n \to X''_n \to 0 \]

is split exact, by definition; applying F, we obtain the exact sequence of
complexes

\[ 0 \to FX' \to FX \to FX'' \to 0, \]

and Theorem 2.5.7 provides us with the exact homology sequence, which is the
required exact sequence. ∎


The functors F_n constructed in Theorem 2.5.10 are called the (left) derived functors
of F. The same result, appropriately modified, gives a construction of right derived
functors of a left exact covariant functor, using injective resolutions. Any given
module A can be embedded in an injective module I_0 by Proposition 2.3.7; by
embedding the cokernel in an injective module I_1 and continuing in this fashion,
we obtain an injective resolution

\[ 0 \to A \to I_0 \to I_1 \to \cdots. \tag{2.5.8} \]

For any left exact covariant functor F we have a series of functors F^n such that


(i) F^0 A = FA,

(ii) F^n I = 0 for n > 0 if I is injective,

(iii) for each short exact sequence A as in Theorem 2.5.10 there is a corresponding
long exact sequence

\[ 0 \to FA' \to FA \to FA'' \xrightarrow{\Delta} F^1 A' \to \cdots \to F^n A' \to F^n A \to F^n A'' \xrightarrow{\Delta} F^{n+1} A' \to \cdots, \tag{2.5.9} \]

where the connecting homomorphism Δ is natural in A, and F^n is determined up to
isomorphism by (i)-(iii).




The proof is exactly analogous to that of Theorem 2.5.10. We note that here the 
index appears as a superscript, and its value increases along the sequence, whereas in 
(2.5.5) the index is a subscript, which decreases as we go along the sequence. 

For a contravariant functor the roles are reversed; if F is left exact, we define F^n by
means of a projective resolution and obtain a long exact sequence (2.5.9), while for a
right exact contravariant functor F we define F_n by an injective resolution and obtain
the long exact sequence (2.5.5). To sum up, a projective resolution is needed for a 
right exact covariant and a left exact contravariant functor, and an injective resolu- 
tion for a left exact covariant or a right exact contravariant functor. 

Of course the construction of Theorem 2.5.10 can be carried out for any functor, 
not necessarily right (or left) exact. In general we obtain in this way the left derived 
functor of F; similarly we can form the right derived functor (using an injective reso- 
lution) and together they form a long exact sequence extending in both directions 
(see Exercise 9). 


Exercises 


1. Show that if I is a pullback or pushout square, then i(I) = 0 and k(I) = 0.

2. Show that the category of chain complexes and chain maps is an abelian 
category. 

3. Let M be a finitely presented R-module, i.e. with a resolution 


\[ 0 \to G \to F \to M \to 0, \]


where F is free and both F and G are finitely generated. Given any short exact
sequence


\[ 0 \to A \to B \to M \to 0, \]


where B is finitely generated, show that A is finitely generated. (Hint. Use 
Proposition 2.4.1.) 

4. Verify that homotopy is an equivalence between chain maps. 

5. Show that if f ≃ g is a homotopy of chain maps, where f, g : X → Y and
h : Y → Z, then fh ≃ gh; likewise ef ≃ eg for e : U → X.

6. Prove Theorem 2.5.4. 

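7. (5-lemma) Given a commutative diagram with exact rows

\[
\begin{array}{ccccccccc}
\cdot & \to & \cdot & \to & \cdot & \to & \cdot & \to & \cdot \\
\downarrow{\scriptstyle f_1} & & \downarrow{\scriptstyle f_2} & & \downarrow{\scriptstyle f_3} & & \downarrow{\scriptstyle f_4} & & \downarrow{\scriptstyle f_5} \\
\cdot & \to & \cdot & \to & \cdot & \to & \cdot & \to & \cdot
\end{array}
\]

show that if f_1, f_2, f_4, f_5 are isomorphisms, then so is f_3. More precisely, if f_1 is
epic and f_2, f_4 are monic, show that f_3 is monic, and dually. Deduce that f_3 is
an isomorphism whenever f_2, f_4 are isomorphisms, f_1 is epic and f_5 is monic.
8. Verify that the isomorphisms F_n → F′_n between two functors satisfying (i)-(iii)
of Theorem 2.5.10 are compatible with Δ.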
9. For any covariant functor F on modules, F_n M (n ≥ 0) is defined as in the proof
of Theorem 2.5.10 in terms of projective resolutions of M. Show that there is a
natural transformation F_0 M → FM and that F_0 is right exact; if now F^n M is
defined similarly in terms of an injective resolution, there is a natural transformation
FM → F^0 M and F^0 is left exact. Hence obtain an exact sequence
of derived functors (F_n, F^n are called the left and right derived functors of F,
respectively).

10. Let R be a ring with IBN and M an R-module with a finite resolution (F;) by free 
modules of finite rank. Show that the integer χ(M) = Σ (−1)^i rk F_i depends
only on M and not on the resolution; it is called the Euler characteristic of M.
Given a short exact sequence A of modules with finite resolutions by free
modules of finite rank, show that χ(A) = χ(A′) + χ(A″). Define χ for a complex
C and show that χ(H(C)) = χ(C).


2.6 Ext, Tor and global dimension 


The most important functors to which the construction of Section 2.5 has been 
applied are A ⊗ B and Hom(A, B). These are bifunctors, i.e. functors of two arguments,
and it is possible to form derived functors in two ways, by resolving either
the first or the second argument. We shall find that the results in these two cases 
are the same; this is based on a general criterion which we shall now derive. 

Let F(A, B) be any bifunctor, covariant right exact in each argument, say. Then F is 
said to be R-balanced or simply balanced if F(A, —) is an exact functor whenever A is 
projective and F(—, B) is exact whenever B is projective. The same definition applies 
if F is contravariant left exact in either argument, while for a covariant left exact or a 
contravariant right exact argument, ‘projective’ is replaced by ‘injective’. 


Theorem 2.6.1. Let F be a bifunctor, covariant right exact in each argument, and
denote by F′_n, F″_n the derived functors obtained by resolving the first and second
argument of F respectively. If F is balanced, then

\[ F'_n(A, B) \cong F''_n(A, B), \tag{2.6.1} \]

by an isomorphism which is natural in A and B.


Proof. By the uniqueness of derived functors (Theorem 2.5.10) we need only take
F′_n(A, B) for fixed A and verify that it satisfies the conditions of Theorem 2.5.10
as a functor in B.


(i) F′_0(A, B) = F(A, B) by definition.
(ii) If B is projective, then F(−, B) is exact. We apply this functor to a projective
resolution of A:

\[ X \to A \to 0, \tag{2.6.2} \]

and obtain

\[ F(X, B) \to F(A, B) \to 0. \]

This is still a resolution, by the exactness of F(−, B). Hence F′_n(A, B) = 0 for
n > 0, by the definition of F′_n(−, B).
(iii) Given a short exact sequence

\[ 0 \to B' \to B \to B'' \to 0, \tag{2.6.3} \]

and a projective resolution (2.6.2) of A, we apply F(X, −) to (2.6.3). Since
F(X, −) is exact, we obtain an exact sequence of complexes:

\[ 0 \to F(X, B') \to F(X, B) \to F(X, B'') \to 0. \]

From this the long exact sequence is obtained by applying Theorem 2.5.7. ∎


A similar argument applies when there is a change of side or variance. 

Let us apply these results to Hom(A, B). This is left exact covariant in B and 
left exact contravariant in A; moreover it is balanced. By Theorem 2.6.1 the derived 
functor may be obtained either by a projective resolution of A or by an injective 
resolution of B. It is written 


\[ \mathrm{Ext}^n_R(A, B) \quad\text{or}\quad \mathrm{Ext}^n(A, B). \]


To account for the name we shall briefly indicate an interpretation of Ext. Given two 
R-modules A, B over a ring R, an extension of A by B is a module E together with a 
short exact sequence 


\[ 0 \to A \to E \to B \to 0. \tag{2.6.4} \]


We can form the category Ex, whose objects are short exact sequences, with the 
obvious maps between them: a morphism is a triple of homomorphisms making 
the diagram 


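\[
\begin{array}{ccccccccc}
0 & \to & A & \to & E & \to & B & \to & 0 \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
0 & \to & A' & \to & E' & \to & B' & \to & 0
\end{array}
\]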


commutative. We observe that if the maps A — A’ and B > B' are isomorphisms, 
then so is E — E’, by the 5-lemma, and we then have an isomorphism of extensions. 
From the extension (2.6.4) we form the exact homology sequence 


\[ 0 \to \mathrm{Hom}(B, A) \to \mathrm{Hom}(B, E) \to \mathrm{Hom}(B, B) \xrightarrow{\Delta} \mathrm{Ext}^1(B, A) \to \cdots \]


Consider the image in Ext^1(B, A) of the identity map j on B: jΔ. This is called the
obstruction or the characteristic class of the extension. Clearly it depends only on the
isomorphism type of the extension (2.6.4). Moreover, it is zero iff (2.6.4) splits, for
jΔ = 0 iff j is induced by a homomorphism B → E, which is just the condition for
(2.6.4) to split, by Corollary 2.1.5.
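
For orientation, here is a standard computation of Ext^1 in the simplest case, offered as a sketch rather than as part of the text's argument. It assumes R = Z, takes B = C_k (a cyclic group of order k > 1) and lets A be any abelian group; the symbols C_k and the resolution used are assumptions made for the example.

```latex
% Worked example (assumptions: R = \mathbf{Z}, B = C_k cyclic of order k > 1,
% A an arbitrary abelian group).
% Resolve C_k by 0 \to \mathbf{Z} \xrightarrow{k} \mathbf{Z} \to C_k \to 0 and apply
% Hom(-, A). Since Hom(\mathbf{Z}, A) \cong A and Ext^1(\mathbf{Z}, A) = 0
% (\mathbf{Z} is projective), the long exact sequence reduces to
\[
  0 \to \mathrm{Hom}(C_k, A) \to A \xrightarrow{\;k\;} A \to \mathrm{Ext}^1(C_k, A) \to 0,
  \qquad\text{so}\qquad
  \mathrm{Ext}^1(C_k, A) \cong A/kA .
\]
% For A = \mathbf{Z} this gives
\[
  \mathrm{Ext}^1(C_k, \mathbf{Z}) \cong \mathbf{Z}/k \neq 0 ,
\]
% in line with the fact that the extension
% 0 \to \mathbf{Z} \xrightarrow{k} \mathbf{Z} \to C_k \to 0 does not split.
```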

We could also apply Hom(−, A) to the sequence (2.6.4) and get

\[ 0 \to \mathrm{Hom}(B, A) \to \mathrm{Hom}(E, A) \to \mathrm{Hom}(A, A) \xrightarrow{\Delta} \mathrm{Ext}^1(B, A) \to \cdots \]


This will give the same obstruction; in fact we have a bijection from the set of
isomorphism classes of extensions of A by B to Ext^1(B, A). We shall return to this
topic in Section 3.1. 

The homological dimension of a module was defined in Section 2.4 in terms of the 
loop functor π; we now show how to express it in terms of Ext. For simplicity we
shall not distinguish between the class [πA] and a module in it.


Proposition 2.6.2. For any R-module A over any ring R the following conditions are 
equivalent: 


(a) hd A ≤ n,
(b) Ext^k(A, −) = 0 for all k > n,
(c) Ext^{n+1}(A, −) = 0.


Proof. For any R-module B we have, by (2.5.7) and its dual,

\[ \mathrm{Ext}^{k}(\pi A, B) \cong \mathrm{Ext}^{k+1}(A, B) \cong \mathrm{Ext}^{k}(A, \iota B) \quad\text{for } k > 0. \tag{2.6.5} \]

Now (a) states that π^n A is projective; so in that case we have for any k > n,

\[ \mathrm{Ext}^{k}(A, -) \cong \mathrm{Ext}^{k-1}(\pi A, -) \cong \cdots \cong \mathrm{Ext}^{k-n}(\pi^{n} A, -) = 0 \]

and (b) follows. Clearly (b) ⇒ (c), so assume (c). Then

\[ \mathrm{Ext}^{1}(\pi^{n} A, -) \cong \mathrm{Ext}^{2}(\pi^{n-1} A, -) \cong \cdots \cong \mathrm{Ext}^{n+1}(A, -) = 0; \]

this means that Hom(π^n A, −) is exact, i.e. π^n A is projective, so (a) holds. ∎
In the dual situation we can assert a little more: 


Proposition 2.6.3. For any R-module B over any ring R the following are equivalent: 


(a) cd B ≤ n,

(b) Ext^k(−, B) = 0 for all k > n,

(c) Ext^{n+1}(−, B) = 0,

(d) Ext^{n+1}(C, B) = 0 for all cyclic modules C.


The proof that (a)-(c) are equivalent is entirely analogous to that of Proposition
2.6.2 and it is clear that (c) implies (d). Conversely, assume (d) for right modules
say; then for any right ideal a of R,

\[ \mathrm{Ext}^{1}(R/\mathfrak{a}, \iota^{n} B) \cong \mathrm{Ext}^{n+1}(R/\mathfrak{a}, B) = 0, \]

hence the short exact sequence

\[ 0 \to \mathfrak{a} \to R \to R/\mathfrak{a} \to 0 \]

leads to the exact sequence

\[ 0 \to \mathrm{Hom}(R/\mathfrak{a}, \iota^{n} B) \to \mathrm{Hom}(R, \iota^{n} B) \to \mathrm{Hom}(\mathfrak{a}, \iota^{n} B) \to 0. \]

This shows that any homomorphism a → ι^n B is obtained by restriction from a
homomorphism R → ι^n B. By Baer’s criterion (Theorem 2.3.4) it follows that ι^n B is
injective, i.e. (a). ∎


From Proposition 2.6.2 we see that the global homological dimension of R may be 
defined as sup{n | Ext^n ≠ 0}, while Proposition 2.6.3 shows that this determines the
global cohomological dimension. Hence we have 


Corollary 2.6.4. For any ring R, the (right) global homological and cohomological 
dimensions are equal, and may be defined as 


\[ \mathrm{r.gl.dim}(R) = \sup\{\, n \mid \mathrm{Ext}^{n}(C, B) \neq 0 \text{ for cyclic } C \text{ and any } B \,\}, \]

i.e. sup(hd C) for cyclic right R-modules C. ∎


Of course it must be borne in mind that the global dimension defined here refers 
to right R-modules, and in general it will be necessary to distinguish this from the 
left global dimension, l.gl.dim(R). As we shall soon see, for Noetherian rings these
numbers coincide, but we shall also meet more general examples where they differ 
(see Exercise 8 of Section 2.4). 

We now turn to the tensor product. The functor A ⊗ B is covariant right exact in
each argument and is balanced. We therefore have a unique derived functor, sometimes
called the torsion product, written

\[ \mathrm{Tor}^R_n(A, B) \quad\text{or}\quad \mathrm{Tor}_n(A, B). \]

As an example consider the case R = Z. Here Tor_n = 0 for n > 1, because Z is
hereditary and so all projective resolutions have length at most 1. Writing C_k for
the cyclic group of order k, we have an exact sequence

\[ 0 \to \mathbf{Z} \xrightarrow{\;k\;} \mathbf{Z} \to C_k \to 0, \]

where k indicates multiplication by k. Tensoring up with C_k, we get

\[ 0 \to \mathrm{Tor}_1(C_k, C_k) \to C_k \otimes \mathbf{Z} \xrightarrow{\;1 \otimes k\;} C_k \otimes \mathbf{Z} \to C_k \otimes C_k \to 0. \]

Denote a generator of C_k by c; then under the induced map 1 ⊗ k we have
c ⊗ 1 ↦ c ⊗ k = ck ⊗ 1 = 0. Therefore ker(1 ⊗ k) = C_k and we find

\[ \mathrm{Tor}^{\mathbf{Z}}_1(C_k, C_k) \cong C_k. \tag{2.6.6} \]

Since Tor, like ⊗, preserves direct sums, (2.6.6) together with the equation
Tor_1(C_h, C_k) = 0 for coprime h, k is enough to determine Tor for any finitely
generated abelian group.
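
The computation above can be checked by brute force for small orders. The sketch below tensors the resolution 0 → Z → Z → C_k → 0 with C_h, so that Tor_1(C_h, C_k) and C_h ⊗ C_k appear as the kernel and cokernel of multiplication by k on Z/h. The function name `tor1_and_tensor_orders` is hypothetical, and only the orders of the groups are computed, not their structure.

```python
from math import gcd


def tor1_and_tensor_orders(h, k):
    """Orders of Tor_1(C_h, C_k) and of C_h (x) C_k, read off as the kernel and
    cokernel of multiplication by k on Z/h."""
    kernel = [x for x in range(h) if (k * x) % h == 0]
    image = {(k * x) % h for x in range(h)}
    return len(kernel), h // len(image)


print(tor1_and_tensor_orders(4, 4))  # the case h = k treated in (2.6.6)
print(tor1_and_tensor_orders(3, 5))  # coprime orders
print(tor1_and_tensor_orders(4, 6))
```

For h = k the kernel is all of Z/k, in agreement with (2.6.6), and for coprime h, k both groups are trivial. In the small cases tried, both orders equal gcd(h, k).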


A right R-module A is said to be flat if A ⊗ − is an exact functor, or equivalently,
if Tor^R_1(A, −) = 0. For example, any projective module is flat. A corresponding
definition applies to left R-modules.
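
Two standard examples over R = Z may help to separate the notions of flat and projective. They are given here as a sketch and rely on facts not proved in this section, namely that localization is exact and that Z has no non-zero divisible subgroups.

```latex
% Examples (assumption: R = \mathbf{Z}).
% 1. \mathbf{Q} is flat: A \otimes \mathbf{Q} is the localization of A at the
%    non-zero integers, and localization is exact, so
\[
  \mathrm{Tor}^{\mathbf{Z}}_1(\mathbf{Q}, -) = 0 .
\]
%    But \mathbf{Q} is not projective: the image of any homomorphism
%    \mathbf{Q} \to \mathbf{Z} is a divisible subgroup of \mathbf{Z}, hence zero,
%    so \mathbf{Q} cannot be embedded in a free \mathbf{Z}-module.
% 2. A cyclic group C_k with k > 1 is not flat, because by (2.6.6)
\[
  \mathrm{Tor}^{\mathbf{Z}}_1(C_k, C_k) \cong C_k \neq 0 .
\]
```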

Using Tor, we can define another dimension for modules, the weak dimension. 
This is defined, for a right R-module A, as 


\[ \mathrm{wd}\,A = \sup\{\, n \mid \mathrm{Tor}^R_n(A, -) \neq 0 \,\}. \]


In terms of the loop functor π we can also define wd A as the least integer k such that
π^k A is flat. Now the weak global dimension of R is defined as

\[ \mathrm{w.gl.dim}(R) = \sup\{\, n \mid \mathrm{Tor}^R_n \neq 0 \,\}. \]


From the definition (and the symmetry of Tor) it is clear that this function is left-right
symmetric. To compare it with the global dimension, let us consider the situation
$(A_R, {}_{\mathbf{Z}}B_R, {}_{\mathbf{Z}}C)$ and define a natural transformation

$$A \otimes_R \mathrm{Hom}_{\mathbf{Z}}(B, C) \to \mathrm{Hom}_{\mathbf{Z}}(\mathrm{Hom}_R(A, B), C) \tag{2.6.7}$$

by mapping the element $a \otimes f$, where $a \in A$, $f \in \mathrm{Hom}_{\mathbf{Z}}(B, C)$, to
$\varphi(a, f) : \theta \mapsto (\theta a)f$, where $\theta \in \mathrm{Hom}_R(A, B)$.


This map is clearly R-balanced and biadditive, and hence defines a homomorphism 
(2.6.7), which is easily seen to be natural. Moreover, for A = R it is an isomorphism, 
hence it is an isomorphism for any finitely generated projective R-module A. We 
now fix C to be Z-injective, i.e. divisible and consider the two sides of (2.6.7) as a 
functor in A. Both sides are covariant right exact; if we apply them to a projective 
resolution (2.6.2) of A we obtain a natural transformation 


$$\mathrm{Tor}_n^R(A, \mathrm{Hom}_{\mathbf{Z}}(B, C)) \to \mathrm{Hom}_{\mathbf{Z}}(\mathrm{Ext}_R^n(A, B), C). \tag{2.6.8}$$


Suppose now that R is right Noetherian and A is finitely generated. Then each term X 
in the resolution (2.6.2) may be taken to be finitely generated and so (2.6.8) will be 
an isomorphism. Thus we have proved 


Proposition 2.6.5. Let R be a right Noetherian ring and C a divisible abelian group.
Then for any right R-modules A, B such that A is finitely generated, we have

$$\mathrm{Tor}_n^R(A, \mathrm{Hom}_{\mathbf{Z}}(B, C)) \cong \mathrm{Hom}_{\mathbf{Z}}(\mathrm{Ext}_R^n(A, B), C) \quad\text{for all } n \ge 0, \tag{2.6.9}$$

by a natural isomorphism. ∎

We shall use this result to compare the weak and homological dimensions. In the
first place, any R-module A satisfies

$$\mathrm{hd}\, A \ge \mathrm{wd}\, A, \tag{2.6.10}$$

because every projective module is flat. It follows that for any ring R,

$$\mathrm{w.gl.dim}(R) \le \mathrm{r.gl.dim}(R),\ \mathrm{l.gl.dim}(R). \tag{2.6.11}$$


Now assume that R is right Noetherian and A is finitely generated over R. Choose n
such that $n \le \mathrm{hd}\, A$; then $\mathrm{Ext}_R^n(A, B) \neq 0$ for some R-module B, and moreover, we
can find a divisible group C into which $\mathrm{Ext}_R^n(A, B)$ has a non-zero homomorphism
(indeed an embedding, by Proposition 2.3.7). Thus for suitable B, C the right-hand
side of (2.6.9) is non-zero; looking at the left-hand side, we deduce that $\mathrm{wd}\, A \ge n$.
Hence equality must hold in (2.6.10) and we have proved




Theorem 2.6.6. If R is a right Noetherian ring, then for any finitely generated right 
R-module A, wd A = hd A. 


Corollary 2.6.7. For any right Noetherian ring R,

$$\mathrm{r.gl.dim}(R) = \mathrm{w.gl.dim}(R) \le \mathrm{l.gl.dim}(R). \tag{2.6.12}$$


Proof. By Corollary 2.6.4 the right global dimension is the supremum of the projective
dimensions of the cyclic right R-modules, and by Theorem 2.6.6 this cannot
exceed the weak global dimension of R; by (2.6.10) it cannot be less, and so the
equality in (2.6.12) follows. The inequality follows similarly from (2.6.10), bearing
in mind that the weak global dimension is left-right symmetric. ∎


If R is left and right Noetherian, we obtain by symmetry, 


Corollary 2.6.8. In a Noetherian ring R,

$$\mathrm{r.gl.dim}(R) = \mathrm{l.gl.dim}(R).$$ ∎


Exercises 


1. Show that if $\mathrm{Ext}_R^n(A, C) = 0$ for all cyclic modules C, then the same holds for
all finitely generated modules C. Do the same for $\mathrm{Ext}_R^n(C, A)$ and $\mathrm{Tor}_n^R(A, C)$.

2. Verify that two extensions of R-modules A by B are isomorphic iff they
correspond to the same element of $\mathrm{Ext}_R^1(B, A)$.

3. Show that Tor preserves direct sums in both arguments, while Ext preserves
direct products in the second argument and converts direct sums in the first
argument to direct products.

4. Verify that the two ways of defining the characteristic class of an extension agree.

5. Let F be a bifunctor from R-modules which is covariant right exact and preserves
direct sums. Show that F is balanced iff $F(R, -)$ and $F(-, R)$ are exact. Similarly
if F is contravariant and converts direct sums to direct products.

6. For any abelian group A denote by tA its torsion subgroup. Show that
$\mathrm{Tor}_1^{\mathbf{Z}}(\mathbf{Q}/\mathbf{Z}, A) \cong tA$.

7. Show that $\mathrm{Tor}_1^{\mathbf{Z}}(C_h, C_k) \cong C_d$, where d is the highest common factor of h and k.

8. Show that for any abelian groups A, B, $\mathrm{Tor}_1(A, B) = \mathrm{Tor}_1(tA, B)$; hence calculate
$\mathrm{Tor}_1(A, B)$ for two finitely generated abelian groups.

9. For any abelian group A show that $\mathrm{Ext}^1(C_n, A) \cong A/nA$, by applying
$\mathrm{Hom}(-, A)$ to a suitable short exact sequence. Deduce that any extension of
finite abelian groups A by B splits if A, B are of coprime orders.

10. Show that (2.6.7) is not always an isomorphism. (Hint. Use Theorem 2.6.6 and
Exercise 8 of Section 2.4.)




2.7 Tensor algebras, universal derivations and syzygies 


We recall from BA, Section 6.2 that for any K-module U over a commutative ring K
we can form a tensor ring as a graded ring whose components are the tensor powers
of U. More generally, we can replace K by a general ring A and take U to be an
A-bimodule. We define the n-th tensor power of U as

$$U^n = U \otimes_A U \otimes_A \cdots \otimes_A U \quad (n \text{ factors}). \tag{2.7.1}$$

Now the tensor A-ring on U is defined as

$$T_A(U) = \bigoplus_{n=0}^{\infty} U^n \quad (U^0 = A), \tag{2.7.2}$$

where the multiplication is defined componentwise by the isomorphism

$$U^r \otimes U^s \cong U^{r+s}, \tag{2.7.3}$$

which follows from the associative law for tensor products. If A is commutative and
the left and right actions on U agree, the ring defined by (2.7.2) is an A-algebra, but
for general A the ring $T_A(U)$ so obtained is an A-ring, i.e. a ring with a homomorphism
$A \to T_A(U)$. If U is the free K-module on a set X, we write $T_K(U)$ as
$K\langle X\rangle$; this is just the free K-algebra on X (see BA, Section 6.2). The ring $T_A(U)$
has the special property that any A-linear mapping of U into an A-ring R can be
extended to a homomorphism of $T_A(U)$ into R:


Theorem 2.7.1. Let A be any ring and U an A-bimodule. Then $T_A(U)$ is the universal
A-ring for A-linear mappings of U into A-rings: there is a homomorphism
$\lambda : U \to T_A(U)$ such that for every A-linear map $f : U \to R$ into an A-ring R there
is a homomorphism $f^* : T_A(U) \to R$ such that

$$f = \lambda f^*. \tag{2.7.4}$$


Proof. The map $\lambda$ may be taken as the embedding which identifies U with $U^1$. Given
an A-linear mapping $f : U \to R$, we extend f to $T_A(U)$ by defining

$$(u_1 \otimes \cdots \otimes u_n)f^* = (u_1 f) \cdots (u_n f).$$

By the properties of the tensor product this defines a mapping $f^*$ from $T_A(U)$ to R, which
is easily seen to be a homomorphism; $f^*$ is unique since it is determined on the
generating set U and (2.7.4) holds, almost by definition. ∎
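
The universal property can be illustrated in the simplest case, the free algebra on a set X over the integers. The sketch below is only an illustration under stated assumptions: elements of the free algebra are stored as dictionaries from words (tuples of generator names) to integer coefficients, the target ring is taken to be the 2 x 2 integer matrices, and all function names (`poly_mul`, `extend`, and so on) are hypothetical.

```python
from itertools import product

def poly_mul(p, q):
    """Multiply two elements of the free algebra; words multiply by
    concatenation, which mirrors the isomorphism U^r (x) U^s = U^(r+s)."""
    out = {}
    for (w1, c1), (w2, c2) in product(p.items(), q.items()):
        out[w1 + w2] = out.get(w1 + w2, 0) + c1 * c2
    return {w: c for w, c in out.items() if c != 0}

def mat_mul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def mat_add(a, b):
    return tuple(tuple(a[i][j] + b[i][j] for j in range(2)) for i in range(2))

def mat_scale(c, a):
    return tuple(tuple(c * a[i][j] for j in range(2)) for i in range(2))

IDENTITY = ((1, 0), (0, 1))
ZERO = ((0, 0), (0, 0))

def extend(f):
    """Return the extension f* of a map f from generators to matrices:
    a word u_1...u_n goes to the product (u_1 f)...(u_n f), extended linearly."""
    def f_star(p):
        total = ZERO
        for word, coeff in p.items():
            image = IDENTITY
            for letter in word:
                image = mat_mul(image, f[letter])
            total = mat_add(total, mat_scale(coeff, image))
        return total
    return f_star

f = {"x": ((1, 1), (0, 1)), "y": ((0, 1), (1, 0))}
f_star = extend(f)
p = {("x",): 1, ("y", "x"): 2}    # x + 2yx
q = {(): 3, ("x", "y"): -1}       # 3 - xy
assert f_star({("x",): 1}) == f["x"]                             # f* restricts to f
assert f_star(poly_mul(p, q)) == mat_mul(f_star(p), f_star(q))   # f* is multiplicative
```

The two assertions correspond to (2.7.4) and to the homomorphism property of the extension.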


It turns out that derivations have the same property. We recall that for any ring
homomorphisms $\alpha : C \to A$, $\beta : C \to B$ and an (A, B)-bimodule M, a mapping
$\delta : C \to M$ is called an $(\alpha, \beta)$-derivation if $\delta$ is linear and satisfies

$$(xy)^\delta = x^\alpha y^\delta + x^\delta y^\beta \quad\text{for all } x, y \in C. \tag{2.7.5}$$


By means of the triangular matrix ring $\begin{pmatrix} A & M \\ 0 & B \end{pmatrix}$ we can rephrase the definition by
saying that $\delta : C \to M$ is an $(\alpha, \beta)$-derivation precisely if the mapping

$$x \mapsto \begin{pmatrix} x^\alpha & x^\delta \\ 0 & x^\beta \end{pmatrix} \quad (x \in C)$$

is a homomorphism. The proof is a simple verification which may be left to the
reader. By applying Theorem 2.7.1 we thus obtain


Corollary 2.7.2. Given an A-bimodule U over a ring A and linear mappings $\alpha$, $\beta$ of U
into A, denote the extensions to $T_A(U)$ (which exist by Theorem 2.7.1) again by $\alpha$, $\beta$.
Then any $(\alpha, \beta)$-derivation $\delta$ of U into an A-bimodule M can be extended in just one
way to an $(\alpha, \beta)$-derivation of $T_A(U)$ into M. ∎
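
The triangular-matrix description of derivations used above can be tested on a familiar case. The following sketch assumes C = A = B = M is the polynomial ring over the integers in one variable t, both homomorphisms are the identity, and the derivation is the formal derivative; polynomials are coefficient lists and an upper triangular matrix is stored as a triple. All names are hypothetical.

```python
def pmul(p, q):
    """Product of polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def padd(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(n)]

def trim(p):
    """Drop trailing zeros so that equal polynomials compare equal."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def deriv(p):
    """Formal derivative, a derivation of the polynomial ring into itself."""
    return [i * a for i, a in enumerate(p)][1:] or [0]

def tri(p):
    """Image of p: the matrix with rows (p, p') and (0, p), stored as a triple."""
    return (trim(p), trim(deriv(p)), trim(p))

def tri_mul(s, t):
    """Matrix product: (a, m; 0, b)(a2, m2; 0, b2) = (a a2, a m2 + m b2; 0, b b2)."""
    (a, m, b), (a2, m2, b2) = s, t
    return (trim(pmul(a, a2)),
            trim(padd(pmul(a, m2), pmul(m, b2))),
            trim(pmul(b, b2)))

p = [1, 2, 0, 3]   # 1 + 2t + 3t^3
q = [0, 1, 5]      # t + 5t^2
# The map p -> tri(p) is multiplicative exactly because of the Leibniz rule.
assert tri(pmul(p, q)) == tri_mul(tri(p), tri(q))
```

The off-diagonal entry of the product is where the rule (2.7.5) appears.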


Any derivation followed by a module homomorphism is again a derivation and we
can ask whether there is a module with a derivation which is universal. For simplicity
we shall assume that $\alpha = \beta = 1$. Thus R is an A-ring and we are dealing with derivations
from R to R-bimodules. These derivations form the objects of a category in
which the morphisms are commutative triangles and we are looking for an initial
object in this category, i.e. an R-bimodule $\Omega$ with a derivation $\delta : R \to \Omega$ such
that every derivation from R can be uniquely factored by $\delta$. Such a bimodule
indeed exists and can be described explicitly.


Proposition 2.7.3 (Eilenberg). Let R be any A-ring with multiplication mapping
$\mu : R \otimes_A R \to R$. This mapping gives rise to an exact sequence

$$0 \to \Omega \to R \otimes_A R \xrightarrow{\;\mu\;} R \to 0, \tag{2.7.6}$$

where $\Omega = \ker \mu$ is an R-bimodule, generated as left (or right) R-module by the
elements $x \otimes 1 - 1 \otimes x$ $(x \in R)$ and split exact as left or right R-module sequence.

This R-bimodule $\Omega$ is the universal derivation bimodule for R, with the universal
derivation

$$\delta : x \mapsto x \otimes 1 - 1 \otimes x \quad (x \in R). \tag{2.7.7}$$


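Proof. It is clear that the multiplication $\mu : x \otimes y \mapsto xy$ is an R-bimodule homomorphism
and its kernel $\Omega$ contains the elements $x \otimes 1 - 1 \otimes x$. Suppose that
$\sum x_i \otimes y_i \in \Omega$; then $\sum x_i y_i = 0$ in R and so

$$\sum x_i \otimes y_i = \sum (x_i \otimes 1 - 1 \otimes x_i)y_i = -\sum x_i(y_i \otimes 1 - 1 \otimes y_i).$$

This shows $\Omega$ to be generated by the elements $x \otimes 1 - 1 \otimes x$ as left or right R-module.
Now the mapping (2.7.7) is a derivation:

$$(xy)^\delta = xy \otimes 1 - 1 \otimes xy = x(y \otimes 1 - 1 \otimes y) + (x \otimes 1 - 1 \otimes x)y = x.y^\delta + x^\delta.y.$$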




It is universal, for if $d : R \to M$ is any derivation, we can define a homomorphism
$f : \Omega \to M$ as follows: if $\sum x_i \otimes y_i \in \Omega$, then $\sum x_i y_i = 0$, hence $\sum (x_i^d y_i + x_i y_i^d) = 0$
and so we may put

$$\Bigl(\sum x_i \otimes y_i\Bigr)f = \sum x_i^d y_i = -\sum x_i y_i^d. \tag{2.7.8}$$

This is an R-bimodule homomorphism, since $(\sum a(x_i \otimes y_i))f = -\sum a x_i.y_i^d$,
$(\sum (x_i \otimes y_i)a)f = \sum x_i^d y_i a$. Moreover, $\delta f : x \mapsto x \otimes 1 - 1 \otimes x \mapsto x^d$. Hence
$d = \delta f$ and f is unique since it is prescribed on the generating set $\{x \otimes 1 - 1 \otimes x\}$
of $\Omega$. Finally to show that (2.7.6) is split exact, we observe that the mapping
$\varphi : R \to R \otimes_A R$ given by $x \mapsto x \otimes 1$ is a left R-module mapping (and $x \mapsto 1 \otimes x$ a
right R-module mapping) such that $\varphi\mu = 1$. ∎


The module $\Omega$ in (2.7.6), regarded as a universal derivation bimodule of R, is often
denoted by $\Omega_A(R)$. We shall be particularly interested in this bimodule when R is the
tensor A-ring on an A-bimodule U, $R = T_A(U)$. In that case we can give an explicit
description.


Proposition 2.7.4. Let A be a K-algebra, U an A-bimodule and $R = T_A(U)$ the tensor
A-ring on U. Then the universal derivation bimodule of R is given by

$$\Omega_A(R) = R \otimes U \otimes R \tag{2.7.9}$$

and the exact sequence

$$0 \to R \otimes U \otimes R \to R \otimes R \to R \to 0 \tag{2.7.10}$$

is split exact as sequence of left (or right) R-modules.
Here and in the proof that follows, all tensor products are understood over A.


Proof. As before, we write $\Omega = \ker \mu$ and consider the canonical derivation
$\delta : R \to \Omega$ given by $x^\delta = 1 \otimes x - x \otimes 1$. We saw in the proof of Proposition 2.7.3
that $\Omega$ is generated as left or right R-module by the $x^\delta$ as x ranges over R. But R
is generated by U as A-ring, therefore $\Omega$ is generated as R-bimodule by the $u^\delta$,
where $u \in U$. Thus the restriction map $\delta|U : U \to \Omega$ gives rise to an R-bimodule
map $\alpha : R \otimes U \otimes R \to R \otimes R$ such that $(1 \otimes u \otimes 1)^\alpha = u^\delta$ and clearly im $\alpha = \Omega$.
Now (2.7.10) is split since this is the case for (2.7.6). ∎


To construct free objects in other varieties of algebras it is often simplest to take 
free algebras and apply the factor theorem. We illustrate the method by the case of 
symmetric algebras, which is of some interest in itself. To begin with we describe the 
process of ‘abelianizing’ a ring, which is analogous to the corresponding notion for 
groups (see BA, Section 3.3). 


Theorem 2.7.5. To any ring R there corresponds a commutative ring $R^{ab}$ with a homomorphism
$\nu : R \to R^{ab}$ which is universal for homomorphisms of R into commutative
rings. Thus each homomorphism f from R to a commutative ring can be factored
uniquely by $\nu$.


Proof. Let $\mathfrak{c}$ be the commutator ideal of R, i.e. the ideal generated by all the commutators
$xy - yx$, where $x, y \in R$, and write $R^{ab} = R/\mathfrak{c}$, with the natural homomorphism
$\nu : R \to R^{ab}$. Then any homomorphism f from R to a commutative
ring maps $xy - yx$ to 0, for all $x, y \in R$, hence $\ker f \supseteq \mathfrak{c}$, and so by the factor theorem
(Theorem 1.2.4) f can be factored uniquely by $\nu$. ∎


We remark that if X is a generating set of R, then the commutator ideal of R is
already generated by the elements $xy - yx$ for all $x, y \in X$. For let $\mathfrak{b}$ be the ideal
generated by these commutators and write $\lambda : R \to R/\mathfrak{b}$ for the natural mapping.
Then $R/\mathfrak{b}$ is generated by the elements $x\lambda$, $x \in X$. Now fix $x_1 \in X$; $x_1\lambda$ commutes
with every $y\lambda$, $y \in X$, so the centralizer of $x_1\lambda$ contains a generating set of $R/\mathfrak{b}$ and
hence is the whole ring, i.e. each $x_1\lambda$ lies in the centre of $R/\mathfrak{b}$. Since $x_1$ could be
any element of X, the centre contains a generating set and so it is the whole ring,
i.e. $R/\mathfrak{b}$ is commutative. This means that $\mathfrak{b} = \ker \lambda \supseteq \mathfrak{c}$, hence $\mathfrak{b} = \mathfrak{c}$ as claimed.

Given a K-module U over a commutative ring K, let T(U) be its tensor algebra.
We define the symmetric algebra on U as the algebra

$$S(U) = T(U)^{ab}. \tag{2.7.11}$$

By what has just been said, we see that S(U) can be obtained from T(U) by imposing
the relations

$$xy = yx \quad\text{for all } x, y \in U, \tag{2.7.12}$$

because T(U) is generated by U. It is clear from the definition that S(U) is universal
for K-linear mappings from U to commutative K-algebras, so we have


Theorem 2.7.6. Let K be a commutative ring. For any K-module U there is a commutative
K-algebra S(U) with a K-linear mapping $\mu : U \to S(U)$ which is universal
for K-linear mappings from U to commutative K-algebras. S(U) can be obtained
from the tensor algebra by imposing the relations (2.7.12). ∎


Let $\delta : U \to S(U)$ be any K-linear mapping. We ask: when can $\delta$ be extended to a
derivation of S(U)? The answer is 'always'. For what we require is a homomorphism
from S(U) to $\begin{pmatrix} S(U) & S(U) \\ 0 & S(U) \end{pmatrix}$ extending the mapping

$$f : u \mapsto \begin{pmatrix} u & u^\delta \\ 0 & u \end{pmatrix}.$$

Now a simple verification shows that uf and vf commute, for any $u, v \in U$, so by the
universal property of S(U) there is a unique K-algebra homomorphism from S(U)
extending f. This proves




Proposition 2.7.7. Let U be a K-module (over a commutative ring K) and S(U) its
symmetric algebra. Then any K-linear mapping $\delta : U \to S(U)$ extends to a unique
derivation of S(U). ∎
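
As an illustration of Proposition 2.7.7, take U free of rank 2 over the integers, so that S(U) is the polynomial ring in two commuting variables x, y. The sketch below assumes polynomials are stored as dictionaries from exponent pairs to coefficients, and builds the derivation with prescribed values on x and y by the Leibniz rule; the helper names are hypothetical.

```python
def mul(p, q):
    """Product in Z[x, y]; a monomial x^i y^j is stored under the key (i, j)."""
    out = {}
    for (i, j), a in p.items():
        for (k, l), b in q.items():
            key = (i + k, j + l)
            out[key] = out.get(key, 0) + a * b
    return {m: c for m, c in out.items() if c}

def add(p, q):
    out = dict(p)
    for m, c in q.items():
        out[m] = out.get(m, 0) + c
    return {m: c for m, c in out.items() if c}

def extend_derivation(dx, dy):
    """Derivation D of Z[x, y] with D(x) = dx and D(y) = dy, defined on
    monomials by the Leibniz rule and extended linearly."""
    def D(p):
        total = {}
        for (i, j), c in p.items():
            if i:
                total = add(total, mul({(i - 1, j): c * i}, dx))
            if j:
                total = add(total, mul({(i, j - 1): c * j}, dy))
        return total
    return D

x, y = {(1, 0): 1}, {(0, 1): 1}
# Prescribe D(x) = xy and D(y) = 1 + x^2 (an arbitrary choice for illustration).
D = extend_derivation(dx=mul(x, y), dy={(0, 0): 1, (2, 0): 1})
p = add(mul(x, x), y)             # x^2 + y
q = add(mul(x, y), {(0, 0): 2})   # xy + 2
assert D(x) == mul(x, y)
assert D(mul(p, q)) == add(mul(D(p), q), mul(p, D(q)))
```

Any choice of the two values gives a derivation, which is the content of the word 'always' in the discussion above.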


We also note the following test for algebraic dependence in fields. If E/k is an
extension field generated by $x_1, \ldots, x_n$, then it is easily verified that the universal
derivation module $\Omega_k(E)$ is spanned by the $dx_i$. Let us write $D_i$ or $\partial/\partial x_i$ for the derivation
of the polynomial ring $k[x_1, \ldots, x_n]$ with respect to $x_i$; this is the derivation
over k which maps $x_i$ to 1 and $x_j$ for $j \neq i$ to 0.


Theorem 2.7.8. Let E/k be a field extension in characteristic 0, and let $(x_i)$ be any
family of elements of E. Then

(i) the $x_i$ are algebraically independent if and only if the $dx_i$ are linearly independent
over E,

(ii) $E/k(x_i)$ is algebraic if and only if the $dx_i$ span $\Omega_k(E)$ as E-space,

(iii) $(x_i)$ is a transcendence basis for E if and only if the $dx_i$ form a basis for $\Omega_k(E)$
over E.

Proof. This follows from the fact that any polynomial relation $f(x_1, \ldots, x_n) = 0$
corresponds to a relation $\sum D_i f(x).dx_i = 0$. ∎


For our last result in this section we shall need a change-of-rings theorem which is
also generally useful. Let $f : R \to S$ be a homomorphism of rings; we saw in Section
2.3 that every S-module U can be considered as R-module ${}^fU$ by pullback along f; in
particular S itself becomes an R-bimodule in this way. Further, every R-module A
gives rise to an induced extension $A_f = A \otimes_R S$ and a coinduced extension
$A^f = \mathrm{Hom}_R(S, A)$.


Theorem 2.7.9. Let R, S be any rings and $f : R \to S$ a homomorphism. If S is projective
as right R-module, then there is a natural isomorphism

$$\mathrm{Ext}_S^n(U, A^f) \cong \mathrm{Ext}_R^n({}^fU, A) \quad (U_S, A_R). \tag{2.7.13}$$

If S is flat as left R-module, then there is a natural isomorphism

$$\mathrm{Ext}_S^n(A_f, U) \cong \mathrm{Ext}_R^n(A, {}^fU) \quad (U_S, A_R) \tag{2.7.14}$$

and

$$\mathrm{Tor}_n^S(A_f, U) \cong \mathrm{Tor}_n^R(A, {}^fU) \quad (A_R, {}_SU). \tag{2.7.15}$$


Proof. Let us take an injective resolution $0 \to A \to I$ and apply the functor
$\mathrm{Hom}_R(S, -)$, which is exact because $S_R$ is projective. We obtain an exact sequence

$$0 \to A^f \to I^f$$

and here the terms $I_n^f$ are injective, as coinduced modules, so this is an injective
resolution of $A^f$. If we now apply the hom functor to this resolution and bear in
mind that by (2.3.3) and (2.3.4),

$$\mathrm{Hom}_S(U, I_n^f) \cong \mathrm{Hom}_R({}^fU, I_n),$$

we obtain (2.7.13). Similarly, if ${}_RS$ is flat, then $- \otimes_R S$ is exact. Hence if $P \to A \to 0$
is a projective resolution, then so is $P_f \to A_f \to 0$. We now apply the hom functor
and use the fact that

$$\mathrm{Hom}_S((P_n)_f, U) \cong \mathrm{Hom}_R(P_n, {}^fU)$$

to obtain (2.7.14); in the same way we apply the tensor product to the isomorphism

$$(P_n)_f \otimes_S U \cong P_n \otimes_R {}^fU$$

to obtain (2.7.15). ∎


We conclude this section by finding an estimate for the global dimension of a 
tensor ring: 


Theorem 2.7.10 (Yu. V. Roganov [1975]). Let K be any commutative ring, C a K-algebra
with r.gl.dim(C) = n, U a C-bimodule and $R = T_C(U)$ the tensor C-ring on U
with canonical map $f : C \to R$.

(i) If ${}_CU$ is flat, then

$$n \le \mathrm{r.gl.dim}(R) \le n + 1, \tag{2.7.16}$$

and r.gl.dim(R) = n + 1 if and only if $\mathrm{hd}(A \otimes_C U) = n$ for some right C-module A.

(ii) If $U_C$ is projective, then (2.7.16) holds, and r.gl.dim(R) = n + 1 if and only if
$\mathrm{cd}(\mathrm{Hom}_C(U, B)) = n$ for some right C-module B.

(iii) If ${}_CU$ is flat and w.gl.dim(C) = m, then

$$m \le \mathrm{w.gl.dim}(R) \le m + 1, \tag{2.7.17}$$

and w.gl.dim(R) = m + 1 if and only if $\mathrm{wd}(A \otimes_C U) = m$ for some $A_C$.


Proof. (W. Dicks) Throughout this proof all tensor products are understood to be
over C, unless otherwise stated.

Let $\Omega = R \otimes U \otimes R$ be the universal derivation bimodule for R and consider the
sequence (2.7.10). Since it is split exact, it remains so on tensoring (over R) with any
right R-module M. Recalling that M qua C-module is just ${}^fM$, we thus obtain the
exact sequence

$$0 \to {}^fM \otimes U \otimes R \to {}^fM \otimes R \to M \to 0,$$

which can also be written

$$0 \to ({}^fM \otimes U)_f \to ({}^fM)_f \to M \to 0.$$




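Hence we obtain the exact homology sequence

$$\cdots \to \mathrm{Ext}_R^i(M, N) \to \mathrm{Ext}_R^i(({}^fM)_f, N) \to \mathrm{Ext}_R^i(({}^fM \otimes U)_f, N) \to \mathrm{Ext}_R^{i+1}(M, N) \to \cdots$$

for any right R-module N. Since ${}_CU$ is flat, so is ${}_CR$ and so by Theorem 2.7.9 this
simplifies to

$$\cdots \to \mathrm{Ext}_R^i(M, N) \to \mathrm{Ext}_C^i({}^fM, {}^fN) \to \mathrm{Ext}_C^i({}^fM \otimes U, {}^fN) \to \mathrm{Ext}_R^{i+1}(M, N) \to \cdots \tag{2.7.18}$$

It follows that $\mathrm{r.gl.dim}(R) \le n + 1$. Moreover, by the definition of n,
$\mathrm{Ext}_C^{n+1}({}^fM, {}^fN) = 0$, so we have a surjection

$$\mathrm{Ext}_C^n({}^fM \otimes U, {}^fN) \to \mathrm{Ext}_R^{n+1}(M, N) \to 0. \tag{2.7.19}$$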


We next show that $\mathrm{r.gl.dim}(R) \ge n$. Choose right C-modules A, B such that
$\mathrm{Ext}_C^n(A, B) \neq 0$ and consider A, B as right R-modules with trivial U-action, i.e.
AU = BU = 0. The C-module structure is then recovered by pullback along f.
Taking M = A, N = B in (2.7.18) and observing that $\mathrm{Hom}_R(A, B) \to \mathrm{Hom}_C(A, B)$
is then an isomorphism, we conclude by exactness that $\mathrm{Hom}_C(A, B) \to \mathrm{Hom}_C(A \otimes U, B)$
is then the zero map. The same applies if we resolve B, hence by
(2.7.18) we have the exact sequence

$$0 \to \mathrm{Ext}_C^{n-1}(A \otimes U, B) \to \mathrm{Ext}_R^n(A, B) \to \mathrm{Ext}_C^n(A, B) \to 0. \tag{2.7.20}$$

It follows that $\mathrm{Ext}_R^n(A, B) \neq 0$, and so $\mathrm{r.gl.dim}(R) \ge n$; this proves (2.7.16). Now if
$\mathrm{hd}(A \otimes U) = n$ for some $A_C$, then by (2.7.20) with n replaced by n + 1, we have
r.gl.dim(R) = n + 1, while if $\mathrm{hd}(A \otimes U) < n$ for all $A_C$, then $\mathrm{hd}({}^fM \otimes U) < n$
and by (2.7.19), $\mathrm{r.gl.dim}(R) \le n$.

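The proof of (ii) is similar. For any right R-module M we have

$$\mathrm{Hom}_R(R \otimes U \otimes R, M) \cong \mathrm{Hom}_C(R, \mathrm{Hom}_R(U \otimes R, M)) \cong \mathrm{Hom}_C(R, \mathrm{Hom}_C(U, {}^fM)) \cong \mathrm{Hom}_C(U, {}^fM)^f.$$

In particular, $\mathrm{Hom}_R(R \otimes R, M) \cong ({}^fM)^f$; hence if we apply $\mathrm{Hom}_R(-, M)$ to (2.7.10)
we obtain the exact sequence

$$0 \leftarrow \mathrm{Hom}_C(U, {}^fM)^f \leftarrow ({}^fM)^f \leftarrow M \leftarrow 0,$$

and using Theorem 2.7.9 we thus find the exact sequence

$$\cdots \to \mathrm{Ext}_R^i(M, N) \to \mathrm{Ext}_C^i(M, N) \to \mathrm{Ext}_C^i(\mathrm{Hom}_C(U, {}^fM), {}^fN) \to \mathrm{Ext}_R^{i+1}(M, N) \to \cdots$$

We see again that $\mathrm{r.gl.dim}(R) \le n + 1$, and the surjection (2.7.19) is replaced by

$$\mathrm{Ext}_C^n(\mathrm{Hom}_C(U, M), N) \to \mathrm{Ext}_R^{n+1}(M, N) \to 0.$$

The argument as before gives the analogue of (2.7.20):

$$0 \to \mathrm{Ext}_C^{n-1}(\mathrm{Hom}_C(U, A), B) \to \mathrm{Ext}_R^n(A, B) \to \mathrm{Ext}_C^n(A, B) \to 0$$

and the rest follows as before.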




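To prove (iii) we apply $\otimes N$ and obtain the exact homology sequence

$$\cdots \to \mathrm{Tor}_{i+1}^R(M, N) \to \mathrm{Tor}_i^C({}^fM \otimes U, {}^fN) \to \mathrm{Tor}_i^C({}^fM, {}^fN) \to \mathrm{Tor}_i^R(M, N) \to \cdots.$$

Hence $\mathrm{w.gl.dim}(R) \le m + 1$. We obtain the exact sequence

$$0 \to \mathrm{Tor}_{m+1}^R(M, N) \to \mathrm{Tor}_m^C({}^fM \otimes U, {}^fN), \tag{2.7.21}$$

and instead of (2.7.20) we find

$$0 \to \mathrm{Tor}_m^C(A, B) \to \mathrm{Tor}_m^R(A, B) \to \mathrm{Tor}_{m-1}^C(A \otimes U, B) \to 0. \tag{2.7.22}$$

It follows that $\mathrm{Tor}_m^R(A, B) \neq 0$, so $\mathrm{w.gl.dim}(R) \ge m$. If $\mathrm{wd}(A \otimes U) = m$ for some
$A_C$, then replacing m by m + 1 in (2.7.22) we find w.gl.dim(R) = m + 1; if
$\mathrm{wd}(A \otimes U) < m$ for all $A_C$, then $\mathrm{wd}({}^fM \otimes U) < m$ and by (2.7.21) we have
$\mathrm{w.gl.dim}(R) \le m$. ∎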


Taking U = C, we obtain what is in effect the Hilbert syzygy theorem:


Corollary 2.7.11. For any ring C and a central indeterminate x, r.gl.dim(C[x]) =
r.gl.dim(C) + 1.


Proof. If r.gl.dim(C) = n, then $U^n = C \otimes \cdots \otimes C \cong C$, as right C-module. If the
generator of U is written x, then it is easily checked that $T(U) \cong C[x]$. The rest is
clear from Theorem 2.7.10 (i), since $A \otimes U \cong A$. ∎


In particular, since any field is semisimple, we have

$$\mathrm{gl.dim}(k[x_1, \ldots, x_n]) = n$$

for any field k, even skew.

This means that when we resolve a module M over the polynomial ring
$k[x_1, \ldots, x_n]$ by means of a free resolution $F = (F_i)$ say, then the submodule of relations
in $F_0$ is not generally free and any generating set for these relations has further
relations. These relations are known as syzygies, from the Greek 'σύζυγος': yoked or
paired (usually applied to the conjunction of heavenly bodies). By the above result,
any free resolution of at most n steps leads to a projective module; Hilbert's theorem
states slightly more than this, namely that the free resolution F can be chosen so as to
terminate (in a free module) at the n-th step (see Eisenbud (1995)). This sharpening
is also a consequence of the rather deeper theorem of Quillen and Suslin (see e.g.
Lam (1978)), which states that every finitely generated projective module over the
polynomial ring $k[x_1, \ldots, x_n]$ over a field k is free.

When K is any commutative ring and U is free as K-module, then by taking C = K
in Theorem 2.7.10 we obtain a formula for the free K-algebra $K\langle X\rangle$:


Corollary 2.7.12. For any commutative ring K and any set X,

$$\mathrm{r.gl.dim}(K\langle X\rangle) = \mathrm{l.gl.dim}(K\langle X\rangle) = \mathrm{gl.dim}(K) + 1.$$

In particular, a free algebra over a field is hereditary. ∎


Thus for any free algebra $k\langle X\rangle$ over a field k, every right ideal (and every left ideal)
is projective. In fact these ideals are free, of unique rank (see Cohn (1985)); for the
case of finitely generated right (or left) ideals this will be proved in Section 8.7 and
the full result is in Section 11.5 below.


Exercises


1. Verify that $U \mapsto S(U)$ is a functor.

2. Show that $S(U \oplus V) \cong S(U) \otimes S(V)$.

3. Given a surjective homomorphism $\mu : U \to V$ of K-modules (where K is a commutative
ring), show that $S(\mu) : S(U) \to S(V)$ is surjective.

4. Writing $[x, y] = xy - yx$, verify the identity $[xy, z] = x[y, z] + [x, z]y$, which
expresses the fact that the mapping $u \mapsto [u, z]$ is a derivation. Use this result
to give another proof of the remark following Theorem 2.7.5, that the commutator
ideal of a ring R is generated by all $[x, y]$, where x, y range over a generating
set of R.

5. Find extensions of Proposition 2.7.3 and Proposition 2.7.4 to $(\alpha, \beta)$-derivations.

6. Given a situation $(A_R, {}_RB_S)$, where $A_R$ and $B_S$ are flat, show that $(A \otimes B)_S$ is flat.
Deduce that if a C-bimodule U is flat as right C-module, then so is $T_C(U)$. Do
the same for 'projective' in place of 'flat'.

7. Find an extension of Theorem 2.7.8 to the case of prime characteristic.

8. Apply Theorem 2.7.8 to prove that the transcendence degree (in characteristic 0)
is an invariant of the dimension. What can be said in prime characteristic?

9. A finitely generated R-module M is called stably free if integers r, s exist such that
$M \oplus R^r \cong R^s$. Given that every finitely generated projective over a polynomial
ring is stably free, derive Hilbert's form of the syzygy theorem from Corollary
2.7.11.

10. Let k be a field and R the k-algebra generated by (disjoint) finite sets $X_1, \ldots, X_r$
with the defining relations xy = yx precisely when x, y lie in different sets. Show
that gl.dim(R) = r. Find the global dimension of R when the relation xy = yx
holds precisely for x, y in the same set.


Further exercises on Chapter 2


1. Let K be a commutative ring; describe products and coproducts in the category
of K-algebras.

2. Let $\mathscr{C}$ be a small category admitting products. If for some objects X, Y, $\mathscr{C}(X, Y)$
has two distinct members, show that $\mathscr{C}(X, Y^I)$ has at least $2^{|I|}$ members, for any
set I. Deduce that any $\mathscr{C}(X, Y)$ has at most one morphism (and so $\mathscr{C}$ is a preordered
set).

3. Show that every map $\alpha : A \to B$ in an abelian category gives rise to an exact
sequence

$$0 \to \ker \alpha \to A \to B \to \operatorname{coker} \alpha \to 0. \tag{2.8.1}$$

Given $f : A \to B$ and $g : B \to A$ such that fg = 1, show that f is monic and g is
epic. By applying the windmill lemma (Exercise 10 of Section 2.1) to the
sequences obtained from (2.8.1) for f, g, deduce that A is isomorphic to a
summand in a coproduct representation of B.


4. Prove Yoneda's lemma for additive categories: If $F : \mathscr{A} \to \mathrm{Ab}$ is given and for
$p \in X^F$ a natural transformation $\hat{p} : h^X \to F$ is defined by the rule that for
$\alpha \in \mathscr{A}(X, Y)$, $\alpha \mapsto p\alpha^F$ maps $Y^{h^X}$ to $Y^F$, verify the naturality and show that
the resulting map $X^F \to \mathrm{Nat}(h^X, F)$ to the set of natural transformations is an
isomorphism.

5. If $G : \mathscr{A} \to \mathrm{Ab}$ is a contravariant functor, show that $X^G \cong \mathrm{Nat}(h_X, G)$, where
$h_X : Y \mapsto \mathscr{A}(Y, X)$ is defined by the rule

$$\mathscr{A}(X, Y) \cong \mathrm{Nat}(h^Y, h^X) \cong \mathrm{Nat}(h_X, h_Y).$$

6. Use Yoneda's lemma to show that the left adjoint of a functor, if it exists, is
unique up to isomorphism.

7. Show that a functor is exact iff it is right exact and preserves monics or left exact
and preserves epics.

8. Show that any left exact functor preserves pullbacks.

9. Show that in any category $\mathscr{A}$ the product of two $\mathscr{A}$-objects X, Y is the object
representing the functor $A \mapsto \mathscr{A}(A, X) \times \mathscr{A}(A, Y)$ (if it exists). Similarly their
coproduct is the object representing $A \mapsto \mathscr{A}(X, A) \times \mathscr{A}(Y, A)$.

10. Let X be an object in an abelian category. Assuming that the equivalence classes
of subobjects of X form a set, partially ordered by inclusion, show that this set is
a modular lattice.

11. In an additive category consider the sequence

$$P \xrightarrow{\;\lambda\;} A \sqcup B \xrightarrow{\;\mu\;} Q.$$

Show that if i, j, p, q are the natural injections and projections of the biproduct
$A \sqcup B$, then the square formed by the maps $\lambda p$, $-\lambda q$, $i\mu$, $j\mu$ commutes if
$\lambda\mu = 0$, is a pullback if $\lambda = \ker \mu$ and is a pushout if $\mu = \operatorname{coker} \lambda$.

12. Given a pullback in an additive category (with notation as in Section 2.1) show
that $\alpha'$ is monic iff $\alpha$ is (for abelian categories this follows from Proposition
2.1.6, but not in general).

13. Let P be a pullback of $\alpha : A \to C$, $\beta : B \to C$ in the category of rings. Show that
if P is an integral domain, then one of $\alpha$, $\beta$, say $\alpha$ is injective, and P is isomorphic
to a subring of B.

14. Show that in an abelian category the intersection of two subobjects of a given
object can be defined as a pullback and describe the dual concept.

15. Given a 3 × 3 'matrix' of short exact sequences between modules $U_{ij}$
(i, j = 1, 2, 3) forming a commutative diagram as in the 3 × 3 lemma, show
that the kernel of the composite map $U_{22} \to U_{33}$ is $\mathrm{im}(U_{12}) + \mathrm{im}(U_{21})$.
Deduce that for any two short exact sequences of modules $U_i$, $V_i$ the kernel of
the map $U_2 \otimes V_2 \to U_3 \otimes V_3$ is $\mathrm{im}(U_1 \otimes V_2) + \mathrm{im}(U_2 \otimes V_1)$.

16. Let $\mathscr{A}$, $\mathscr{B}$ be abelian categories with direct sums and F, G two right exact
functors from $\mathscr{A}$ to $\mathscr{B}$, which preserve coproducts. Show that if there is a natural
transformation $t : F \to G$ such that for a generator P of $\mathscr{A}$, $t : P^F \to P^G$ is an
isomorphism, then t is a natural isomorphism. (Hint. Apply t to a 'presentation'
of an object and use the 5-lemma.)

17. (Eilenberg-Watts) Show that for any functor $S : \mathrm{Mod}_A \to \mathrm{Mod}_B$ the following
are equivalent: (a) S has a right adjoint $T : \mathrm{Mod}_B \to \mathrm{Mod}_A$, (b) S is right
exact and preserves coproducts, (c) $S = - \otimes_A P$, for some (A, B)-bimodule P,
unique up to isomorphism. Show that when (a)-(c) hold, the adjoint is
$Y^T = \mathrm{Hom}_B(P, Y)$. (Hint. Use Exercise 16; see also Section 4.4.)

18. Show that for a finitely generated projective R-module P, $\mathrm{Hom}_R(P, M) \cong P^* \otimes M$,
where $P^* = \mathrm{Hom}_R(P, R)$. (Hint. Use Exercise 16.)

19. A complex (C, d) is said to split if it is homotopic to (H(C), 0). Show that (C, d)
splits iff (a) C has a homotopy s of degree 1 such that dsd = d or equivalently,
(b) B(C), Z(C) are direct summands of C. (Hint. If s is a homotopy as in (a),
then sd is a projection on B(C) and 1 − ds is a projection on Z(C).)

20. Given a chain map $f : X \to X'$ between complexes, to obtain an exact sequence
which includes the maps $f^* : H_nX \to H_nX'$, define a new complex M(f), the
mapping cone of f, as $M_n = X_{n-1} \oplus X'_n$, with $(x, x')d = (-xd, x'd + xf)$.
Verify that there is a short exact sequence

$$0 \to X' \to M \to \Sigma X \to 0,$$

where $(\Sigma X)_n = X_{n-1}$, and obtain the associated exact homology sequence.
Deduce that M(f) is acyclic iff $f^*$ is chain equivalence.

21. Given a chain map $f : X \to X'$ between complexes, define the mapping cylinder
N(f) on f as the complex $N_n = X_n \oplus X_{n-1} \oplus X'_n$ with
$(x, y, x')d = (xd + y, -yd, x'd - yf)$. Verify that there is a short exact sequence

$$0 \to X \to N(f) \to M(f) \to 0,$$

where M(f) is the mapping cone from Exercise 20. Using the associated
homology sequence, show that $\alpha : X' \to N(f)$, $\beta : N(f) \to X'$ are mutually
homotopy inverse, where $\alpha : x' \mapsto (0, 0, x')$ and $\beta : (x, y, x') \mapsto x' + xf$.

22. (Eilenberg trick) Let P be a projective module, say $F = P \oplus P'$ is free. By expressing
the direct sum S of countably many copies of F in two ways, show that
$P \oplus S \cong S$.

23. Show that the triangular matrix ring $\begin{pmatrix} k & k \\ 0 & k \end{pmatrix}$ over a field k is hereditary. Is it a
principal ideal ring?

24. Given a short exact sequence of modules

$$0 \to A' \to A \to A'' \to 0,$$

show that $\mathrm{hd}(A) \le \max\{\mathrm{hd}(A'), \mathrm{hd}(A'')\}$ with equality except possibly when
$\mathrm{hd}(A'') = \mathrm{hd}(A') + 1$. Find similar bounds for hd(A'), hd(A'') in terms of the
other two.

25. Let R be a ring and M an R-module. Show that if gl.dim(R) = r and
hd(M) = r − 1, then every submodule of M has homological dimension at
most r − 1.


26. A ring is said to be right semihereditary if every finitely generated right ideal is
projective. Show that in a right semihereditary ring the finitely generated right
ideals form a sublattice of the lattice of all right ideals.

27. A ring R is said to be weakly semihereditary if, given finitely generated projective
modules $P, P_1, P_2$ and maps $\alpha : P_1 \to P$, $\beta : P \to P_2$ such that $\alpha\beta = 0$, there is a
decomposition $P = P' \oplus P''$ such that im $\alpha \subseteq P' \subseteq \ker \beta$. Show that R is weakly
semihereditary iff for any matrices $A \in {}^mR^n$, $B \in {}^nR^p$ such that AB = 0 there
exists an idempotent n × n matrix E over R such that AE = A, EB = 0.
Deduce that the condition is left-right symmetric. Show also that every right
(or left) semihereditary ring is weakly semihereditary.

28. Show that over a right semihereditary ring R every finitely generated submodule
of a projective module is projective and that every finitely generated projective
module is isomorphic to a direct sum of a finite number of finitely generated
right ideals.

29. Show that an injective R-module E is a cogenerator iff $\mathrm{Hom}_R(S, E) \neq 0$ for every
simple R-module S. Verify that $\mathbf{Q}/\mathbf{Z}$ is an injective cogenerator for $\mathbf{Z}$ (see also
Section 4.6).

30. Show that for any commutative ring K, $\mathrm{Ext}_K^n(A, B)$ and $\mathrm{Tor}_n^K(A, B)$ have a
natural K-module structure. Show that if, moreover, K is Noetherian and A, B
are finitely generated, then $\mathrm{Ext}_K^n(A, B)$ and $\mathrm{Tor}_n^K(A, B)$ are finitely generated as
K-modules.


Further group theory 


Group theory has developed so much in recent years that a separate volume would 
be needed even for an introduction to all the main areas of research. The most a 
chapter can do is to give the reader a taste by a selection of topics; our choice was 
made on the basis of general importance or interest, and relevance in later applica- 
tions. Thus ideas from extension theory (Section 3.1) are used in the study of simple 
algebras, while the notion of transfer (Section 3.3) has its counterpart in rings in the 
form of determinants. Hall subgroups (Section 3.2) are basic in the deeper study of 
finite groups, the ideas of universal algebra are exemplified by free groups 
(Section 3.4) and linear groups (Section 3.5) lead to an important class of simple 
groups, as do symplectic groups (Section 3.6) and orthogonal groups (Section 3.7). 
We recall some standard notations from BA. If a group G is generated by a set X, we
write $G = \mathrm{gp}\{X\}$, and we put $\mathrm{gp}\{X \mid R\}$ for a group with generating set X and set of
defining relations R. For subsets X, Y of G, XY denotes the set of all products xy,
where $x \in X$, $y \in Y$. We write $N \lhd G$ to indicate that N is a normal subgroup in
G, i.e. mapped into itself by all inner automorphisms of G. If H, K are subgroups
of G, then HK is a subgroup precisely when HK = KH; in particular this holds
when H or K is normal in G. We also recall the modular law: given subgroups K,
L, M of G, if $K \subseteq M$, then $K(L \cap M) = KL \cap M$.


3.1 Group extensions 


In BA, Section 2.3 we saw that every finite group has a composition series; the factors
of this series are simple groups and for the structure of G one has (i) to study these
simple groups, and (ii) to determine how G is composed from them. For the
moment we concentrate on (ii); in its simplest form this is the extension problem. 
It is of great general interest and some special cases will be of use to us later. 

An extension of groups may be written as a short exact sequence 


$$1 \to A \xrightarrow{\lambda} E \xrightarrow{\mu} G \to 1. \tag{3.1.1}$$


As in the case of abelian groups (Z-modules) this means that A is isomorphic to a
normal subgroup $A_1$ of E and $E/A_1 \cong G$. We shall call E an extension of A by G.




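Given any two groups A, G, their direct product $G \times A$ is such an extension, but in
general there will be many others. Two extensions of A by G, say $E_1$ and $E_2$, are said to
be isomorphic if there is an isomorphism $f : E_1 \to E_2$ such that the diagram

$$\begin{array}{ccccccccc}
1 & \to & A & \to & E_1 & \to & G & \to & 1 \\
 & & \| & & \downarrow f & & \| & & \\
1 & \to & A & \to & E_2 & \to & G & \to & 1
\end{array}$$

commutes. By the 5-lemma any such homomorphism f is an isomorphism. The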
aim of extension theory is to obtain a survey of all (isomorphism classes of) exten- 
sions of a prescribed pair of groups. In principle this is solved by Schreier’s theorem 
(Theorem 3.1.2 below), but the solution is not very explicit and the description in 
terms of cohomology is more illuminating, although not exhaustive. 

We begin by discussing an important special case, that of split extensions. The 
extension (3.1.1) is said to split if there is a homomorphism $\sigma : G \to E$ such that
$\sigma\mu = 1$; this means that we can choose the transversal for the subgroup $A\lambda$ in E
to be a subgroup (not necessarily normal in E). Consider any split extension
(3.1.1) and denote the images of G, A in E by $G_1$, $A_1$ respectively. Any element x
of E has the same image $x\mu$ as some element $g \in G_1$, thus $(g^{-1}x)\mu = 1$, and by
exactness, $g^{-1}x \in A_1$. Hence each $x \in E$ has the form

$$x = ga, \quad \text{where } g \in G_1,\ a \in A_1. \tag{3.1.2}$$


This just means that $E = G_1A_1$. Next we observe that $G_1 \cap A_1 = 1$, for the restriction
of $\mu$ to $G_1$ is injective and its kernel is $G_1 \cap A_1$. It follows that the expression (3.1.2)
for x is unique: if $x = ga = g'a'$, where $g, g' \in G_1$, $a, a' \in A_1$, then
$a'a^{-1} = g'^{-1}g \in G_1 \cap A_1 = 1$, hence $a' = a$, $g' = g$. Let us identify G with $G_1$ and A with $A_1$,
so that we have $E = G \times A$, as sets. By hypothesis A is normal in E; if G is also normal
in E, then E is just the direct product of G and A, as is easily checked, but in general G
need not be normal in E; each element of G then defines an inner automorphism of
E, which induces an automorphism of the normal subgroup A. If we write $\alpha_g$ for the
automorphism of A induced by $g \in G$, then we have the commutation rule

$$ag = g.a\alpha_g \quad \text{for any } a \in A,\ g \in G. \tag{3.1.3}$$

It is easily verified that the mapping

$$g \mapsto \alpha_g \tag{3.1.4}$$


is a homomorphism from G to Aut A, the automorphism group of A, and (3.1.3) is 
enough to determine the multiplication in E: 


$$(ga)(g'a') = gg'.(a\alpha_{g'})a'. \tag{3.1.5}$$


Thus the extension E is determined by G, A and the action of G on A; this is just
the semidirect product of G and A with the action $\alpha$, which we have met in BA,
Section 2.4, denoted by $G \ltimes A$ or $G \ltimes_\alpha A$. Given any groups G, A and a
homomorphism $\alpha : G \to \operatorname{Aut} A$, we can define a group structure on the set $G \times A$ by
the rule

$$(g, a)(g', a') = (gg', (a\alpha_{g'})a').$$


We saw in BA, Section 2.4 (and can easily verify directly) that E is a group in which 
the elements (g, 1) form a subgroup isomorphic to G and the elements (1, a) form a 
normal subgroup isomorphic to A, with quotient isomorphic to G; thus E is a split 
extension of A by G. We state the result as 


Theorem 3.1.1. Let G, A be any groups and $\alpha : G \to \operatorname{Aut} A$ a homomorphism. Then
the semidirect product $G \ltimes_\alpha A$ is a split extension of A by G with action $\alpha$, and all split
extensions arise in this way. ∎


For example, the dihedral group $D_m$ of order 2m can be written as a semidirect
product of cyclic groups:

$$D_m = C_2 \ltimes C_m,$$

where $C_2$ acts on $C_m$ by the automorphism $x \mapsto x^{-1}$.
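As a sanity check on the multiplication rule for a semidirect product, the following sketch builds the dihedral group as a semidirect product of a cyclic group of order 2 and one of order m, and tests the group axioms by brute force. It is an illustration only, not part of the text's development: both cyclic groups are written additively, and the names `semidirect_product`, `act` and `is_group` are our own.

```python
from itertools import product


def semidirect_product(m):
    """Pairs (g, a) with g in C_2 = {0, 1} and a in C_m = {0, ..., m-1}, both additive."""

    def act(a, g):
        # a * alpha_g: the generator of C_2 inverts every element of C_m
        return (-a) % m if g == 1 else a

    def mul(x, y):
        (g, a), (h, b) = x, y
        # the rule (g, a)(g', a') = (gg', (a alpha_{g'}) a'), written additively
        return ((g + h) % 2, (act(a, h) + b) % m)

    elements = list(product(range(2), range(m)))
    return elements, mul


def is_group(elements, mul, identity):
    """Brute-force check of associativity, neutral element and inverses."""
    assoc = all(mul(mul(x, y), z) == mul(x, mul(y, z))
                for x in elements for y in elements for z in elements)
    unit = all(mul(identity, x) == x == mul(x, identity) for x in elements)
    inverses = all(any(mul(x, y) == identity for y in elements) for x in elements)
    return assoc and unit and inverses


elements, mul = semidirect_product(5)
s, t = (1, 0), (0, 1)                      # a reflection and a rotation
print(is_group(elements, mul, (0, 0)))     # group axioms hold
print(mul(mul(s, t), s))                   # s^{-1} t s, to be compared with t^{-1} = (0, 4)
```

Here s has order 2, so conjugating t by s is the same as forming sts; the result is the inverse of t, which is the defining relation of the dihedral group.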


Let us now return to general extensions. Given any extension (3.1.1), where we
take A to be identified with its image in E by means of $\lambda$, let us denote the elements
of A by latin letters and those of G by greek letters. For each $\alpha \in G$ choose an element
$g_\alpha \in E$ which maps to $\alpha$, taking $g_1 = 1$ in order to simplify later formulae. In general
it will no longer be possible to choose the $g_\alpha$ to form a subgroup (they form a
transversal of A in E), but in any case we have $g_\alpha g_\beta \equiv g_{\alpha\beta} \pmod{A}$, hence

$$g_\alpha g_\beta = g_{\alpha\beta}m_{\alpha,\beta}, \quad \text{where } m_{\alpha,\beta} \in A, \tag{3.1.6}$$

and since $g_1 = 1$, we have

$$m_{\alpha,1} = m_{1,\beta} = 1. \tag{3.1.7}$$

Further, each $g_\alpha$ induces an automorphism $\theta_\alpha$ of A:

$$g_\alpha^{-1}ag_\alpha = a\theta_\alpha \quad \text{for all } a \in A,\ \alpha \in G, \tag{3.1.8}$$

where $\theta_1 = 1$. It is no longer true that $\theta$ has to be a homomorphism, i.e. $\theta_\alpha\theta_\beta$ will not
in general equal $\theta_{\alpha\beta}$, but will differ from it by an inner automorphism; by (3.1.6) we
have

$$\theta_\alpha\theta_\beta = \theta_{\alpha\beta}\,\iota(m_{\alpha,\beta}), \tag{3.1.9}$$

where for any $x \in A$, $\iota(x) : u \mapsto x^{-1}ux$ is the inner automorphism defined by x. As is
easily verified, the set Inn A of all these inner automorphisms is a normal subgroup
of Aut A, the group of all automorphisms of A. The quotient Aut(A)/Inn(A) is called
the automorphism class group and (3.1.9) shows that we have a homomorphism

$$\bar\theta : G \to \operatorname{Aut}(A)/\operatorname{Inn}(A). \tag{3.1.10}$$




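The set $\{m_{\alpha,\beta}\}$ is called a factor set of the extension, normalized because it satisfies
(3.1.7). In E we have $g_\alpha(g_\beta g_\gamma) = g_\alpha g_{\beta\gamma}m_{\beta,\gamma} = g_{\alpha\beta\gamma}m_{\alpha,\beta\gamma}m_{\beta,\gamma}$ and
$(g_\alpha g_\beta)g_\gamma = g_{\alpha\beta}m_{\alpha,\beta}g_\gamma = g_{\alpha\beta}g_\gamma(m_{\alpha,\beta}\theta_\gamma) = g_{\alpha\beta\gamma}m_{\alpha\beta,\gamma}(m_{\alpha,\beta}\theta_\gamma)$, hence by the associative law we
obtain the factor set condition

$$m_{\alpha,\beta\gamma}m_{\beta,\gamma} = m_{\alpha\beta,\gamma}(m_{\alpha,\beta}\theta_\gamma) \quad \text{for all } \alpha, \beta, \gamma \in G. \tag{3.1.11}$$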


Conversely, given any groups G and A and mappings $\theta : G \to \operatorname{Aut} A$ and
$m : G^2 \to A$ such that $\theta_1 = 1$ and (3.1.7), (3.1.9) and (3.1.11) hold, we can define
a multiplication on the set $G \times A$ by putting

$$(\alpha, a)(\beta, b) = (\alpha\beta, m_{\alpha,\beta}(a\theta_\beta)b); \tag{3.1.12}$$

then it is straightforward to verify that we obtain a group in this way, which is an
extension of A by G with factor set $\{m_{\alpha,\beta}\}$ and automorphisms $\theta_\alpha$. The associative
law follows from (3.1.11), the neutral element is (1, 1) and the inverse of $(\alpha, a)$ is
$(\alpha^{-1}, (m_{\alpha^{-1},\alpha}^{-1}a^{-1})\theta_\alpha^{-1})$; the verification may be left to the reader. Here the
normalization (3.1.7) is not essential, but without it the formulae become a little more
complicated.
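The construction (3.1.12) can be tried out on the smallest non-split example. The sketch below takes both G and A to be cyclic of order 2 with trivial action, written additively, and compares the trivial factor set with the "carry" of binary addition. The function names `carry`, `trivial`, `make_mul` and `element_orders` are ours, and the choice of example is an assumption made for illustration.

```python
from itertools import product

G = [0, 1]   # cyclic group of order 2, written additively in the code
A = [0, 1]   # cyclic group of order 2 with trivial action (theta_alpha = identity)


def carry(alpha, beta):
    # candidate factor set: the carry produced when two bits are added
    return 1 if alpha == 1 and beta == 1 else 0


def trivial(alpha, beta):
    return 0


def satisfies_factor_set_condition(m):
    # (3.1.11) with trivial action, written additively
    return all((m(a, (b + c) % 2) + m(b, c)) % 2 == (m((a + b) % 2, c) + m(a, b)) % 2
               for a, b, c in product(G, repeat=3))


def is_normalized(m):
    # (3.1.7), written additively
    return all(m(a, 0) == 0 == m(0, a) for a in G)


def make_mul(m):
    def mul(x, y):
        (alpha, a), (beta, b) = x, y
        # (3.1.12): (alpha, a)(beta, b) = (alpha beta, m_{alpha,beta} (a theta_beta) b)
        return ((alpha + beta) % 2, (m(alpha, beta) + a + b) % 2)
    return mul


def order(x, mul, identity=(0, 0)):
    n, y = 1, x
    while y != identity:
        y = mul(y, x)
        n += 1
    return n


def element_orders(m):
    mul = make_mul(m)
    return sorted(order(x, mul) for x in product(G, A))


print(element_orders(trivial))   # orders in the direct product
print(element_orders(carry))     # orders in the extension defined by the carry
```

With the trivial factor set every non-neutral element has order 2, so the extension is the direct product; with the carry there are elements of order 4, so the extension is cyclic of order 4 and does not split.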

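Let E be an extension with factor set $\{m_{\alpha,\beta}\}$ arising from the transversal $\{g_\alpha\}$. If
$\{g'_\alpha\}$ is a second transversal, then

$$g'_\alpha = g_\alpha c_\alpha, \quad \text{where } c_\alpha \in A,\ c_1 = 1,$$

and the new factor set $\{m'_{\alpha,\beta}\}$ obtained from the $g'_\alpha$ is related to the old by the
equations $g_{\alpha\beta}c_{\alpha\beta}m'_{\alpha,\beta} = g'_{\alpha\beta}m'_{\alpha,\beta} = g'_\alpha g'_\beta = g_\alpha c_\alpha g_\beta c_\beta = g_\alpha g_\beta(c_\alpha\theta_\beta)c_\beta = g_{\alpha\beta}m_{\alpha,\beta}(c_\alpha\theta_\beta)c_\beta$.
Hence

$$m'_{\alpha,\beta} = c_{\alpha\beta}^{-1}m_{\alpha,\beta}(c_\alpha\theta_\beta)c_\beta; \tag{3.1.13}$$

we shall express (3.1.13) by saying that $\{m'_{\alpha,\beta}\}$ and $\{m_{\alpha,\beta}\}$ are associated. Similarly,
we obtain from (3.1.8)

$$\theta'_\alpha = \theta_\alpha\,\iota(c_\alpha). \tag{3.1.14}$$

Conversely, if $m'_{\alpha,\beta}$ and $\theta'_\alpha$ are defined by (3.1.13) and (3.1.14), they lead to an
extension isomorphic to the given one, as we see by retracing our steps. We sum up these
results in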


Theorem 3.1.2 (Schreier’s extension theorem). Given two groups G and A, a
homomorphism $\bar\theta : G \to \operatorname{Aut}(A)/\operatorname{Inn}(A)$ and a map $m : G^2 \to A$ satisfying the factor set
condition (3.1.11), we can define a multiplication on the set $G \times A$ by (3.1.12) and
so obtain a group which is an extension of A by G. All extensions of A by G are obtained
in this way and two extensions are isomorphic if and only if their factor sets are
associated. ∎


To prove this result in full, some verifications are needed, which are straight- 
forward (though tedious) and will therefore be left to the reader. Instead we shall 
examine a special case, important for the applications, in more detail. 




Let us assume that A is abelian; in that case Inn A is trivial and so the map
$\theta : G \to \operatorname{Aut} A$ is a homomorphism. Moreover, as (3.1.14) shows, the automorphism
$\theta_\alpha$ depends only on $\alpha$ and not on the choice of $g_\alpha$. So in this case we have an action
of G on A (by automorphisms) and in place of $x\theta_\alpha$ we shall simply write $x^\alpha$. This
action is trivial precisely when A is contained in the centre of E; we call this a central 
extension. 

In what follows we shall write A as an additive group, but G will still be multi- 
plicative. The action of G then turns A into a G-module and the factor set condition 
(3.1.11) now reads 


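$$m_{\alpha,\beta\gamma} + m_{\beta,\gamma} = m_{\alpha\beta,\gamma} + m_{\alpha,\beta}^{\gamma}, \tag{3.1.15}$$

while associated factor sets are related by the equations

$$m'_{\alpha,\beta} = m_{\alpha,\beta} - c_{\alpha\beta} + c_\alpha^{\beta} + c_\beta. \tag{3.1.16}$$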


Our aim is to obtain a homological description of all extensions with abelian kernel. 
We recall that G-modules may equally well be regarded as ZG-modules, where ZG is 
the group algebra of G over Z. Moreover, any left G-module A can be regarded as a 
right G-module by using the canonical antiautomorphism of ZG. Explicitly we put 


$$ag = g^{-1}a \quad \text{for all } a \in A,\ g \in G.$$


Let A, B be any right G-modules and consider Hom(A, B), the group of all (abelian 
group) homomorphisms from A to B. We can define a G-module structure on this 
group as follows. If f € Hom(A, B), we put 


$$f^s : a \mapsto (f(as^{-1}))s, \quad \text{for } s \in G. \tag{3.1.17}$$

This is indeed a G-module structure, for (3.1.17) can be written

$$f^s(as) = f(a)s,$$

hence $f^{st}(ast) = (f(a))st = (f^s(as))t = (f^s)^t(ast)$, therefore $f^{st} = (f^s)^t$, and the


remaining laws are clear. 
For any G-module A we define 


$$A^G = \{x \in A \mid xs = x \text{ for all } s \in G\};$$

thus $A^G$ is the largest submodule left fixed by G. For example, if A, B are any
G-modules, then $f \in \operatorname{Hom}(A, B)$ is left fixed by G iff $f^s = f$, i.e. f(as) = f(a)s for all
$s \in G$. This just means that f is a G-homomorphism from A to B, hence we have

$$(\operatorname{Hom}(A, B))^G = \operatorname{Hom}_G(A, B). \tag{3.1.18}$$


In particular, regarding Z as a G-module by the trivial action: ns = n for all $n \in \mathbb{Z}$
and $s \in G$, we have

$$\operatorname{Hom}_G(\mathbb{Z}, A) = (\operatorname{Hom}(\mathbb{Z}, A))^G \cong A^G. \tag{3.1.19}$$

It is clear from (3.1.19) that the functor $A \mapsto A^G$ is left exact; we can therefore form
the right derived functor, as described in Sections 2.5 and 2.6. The n-th right derived
functor of $A^G$ is written $H^n(G, A)$ and is called the n-th cohomology group of the
G-module A. By (3.1.19) we see that

$$H^n(G, A) = \operatorname{Ext}^n_G(\mathbb{Z}, A), \quad H^0(G, A) = A^G,$$


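where for the subscript on the right we have written G rather than ZG. We note that
for any coinduced module A we have $H^n(G, A) = 0$ for $n \geq 1$. For ZG is free as Z-module,
hence by the change-of-rings formula (2.7.13), we have

$$\operatorname{Ext}^n_G(\mathbb{Z}, B^f) \cong \operatorname{Ext}^n_{\mathbb{Z}}(\mathbb{Z}, B) = 0 \quad \text{for all } n \geq 1.$$

This holds for any coinduced module $B^f = \operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}G, B)$, where $f : \mathbb{Z} \to \mathbb{Z}G$.
Similarly the n-th homology group of the G-module A is defined as

$$H_n(G, A) = \operatorname{Tor}_n^G(\mathbb{Z}, A).$$

If A is induced, say $A = B_f = \mathbb{Z}G \otimes_{\mathbb{Z}} B$, then $\operatorname{Tor}_n^G(\mathbb{Z}, B_f) \cong \operatorname{Tor}_n^{\mathbb{Z}}(\mathbb{Z}, B) = 0$ for all
$n \geq 1$, hence we have $H_n(G, A) = 0$ ($n \geq 1$) for any induced G-module A.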

Let $\varepsilon : \mathbb{Z}G \to \mathbb{Z}$ be the augmentation map, defined as $\varepsilon : \sum a_s s \mapsto \sum a_s$. Its kernel
IG is called the augmentation ideal of ZG, and from the split exact sequence

$$0 \to IG \to \mathbb{Z}G \to \mathbb{Z} \to 0 \tag{3.1.20}$$

we obtain, by tensoring with a G-module A, the exact sequence

$$0 \to (IG)A \to A \to \mathbb{Z} \otimes A \to 0.$$


Hence $\mathbb{Z} \otimes A \cong A/(IG)A$ and we see that $H_n(G, A)$ is the left derived functor of
A/(IG)A. We note that A/(IG)A, sometimes written $A_G$, is the largest quotient of
A with trivial G-action, as is easily seen. Hence we find that

$$H_0(G, A) \cong A_G = A/(IG)A. \tag{3.1.21}$$


From the exact sequence (3.1.20) and the definitions of $H_n$, $H^n$ we obtain by shifting
dimensions,

$$H_n(G, A) \cong \operatorname{Tor}^G_{n-1}(A, IG), \tag{3.1.22}$$

$$H^n(G, A) \cong \operatorname{Ext}^{n-1}_G(IG, A). \tag{3.1.23}$$


We recall that to construct $H_n(G, A)$ or $H^n(G, A)$ we take a projective resolution X of
Z (as trivial G-module) and form the complex $A \otimes X$ or Hom(X, A) respectively.
Taking first $H_n(G, A)$ let us form $A \otimes X$, where the $X_n$ are now left modules; thus
$X_n$ consists of all n-chains, i.e. (n + 1)-tuples over G with the action

$$g(s_0, s_1, \ldots, s_n) = (gs_0, gs_1, \ldots, gs_n).$$

The differential $d : X_n \to X_{n-1}$ is given by the formula

$$(s_0, \ldots, s_n)d = \sum_{i=0}^{n} (-1)^i(s_0, \ldots, \hat{s}_i, \ldots, s_n), \tag{3.1.24}$$




where the caret on $s_i$ means that this term is to be omitted. The augmentation map
$\varepsilon : X_0 \to \mathbb{Z}$ is defined by $s_0\varepsilon = 1$. With these definitions we have $d^2 = 0$, as we see by
counting how often the term $(s_0, \ldots, \hat{s}_p, \ldots, \hat{s}_q, \ldots, s_n)$, where $p < q$, occurs in
$(s_0, \ldots, s_n)d^2$.

To prove exactness, we define a homotopy $h : X_n \to X_{n+1}$ by the rule:
$(s_0, \ldots, s_n)h = (1, s_0, \ldots, s_n)$ for $n \geq 0$, $1.h = (1)$ for $n = -1$. Then it is easily verified
that $hd + dh = 1$. This resolution is called the standard (or bar) resolution of Z over
ZG, in homogeneous form. The elements of ker d, im d are called cycles and boundaries
respectively. Sometimes the inhomogeneous form is more convenient to use;
this is defined as

$$[t_1 | t_2 | \cdots | t_n] = (1, t_1, t_1t_2, \ldots, t_1t_2 \cdots t_n). \tag{3.1.25}$$

Now the differential is given by

$$[t_1 | \cdots | t_n]d = t_1[t_2 | \cdots | t_n] + \sum_{i=1}^{n-1} (-1)^i[t_1 | \cdots | t_it_{i+1} | \cdots | t_n] + (-1)^n[t_1 | \cdots | t_{n-1}].$$

In low dimensions we have

$$[\ ]\varepsilon = 1, \quad [g]d = g[\ ] - [\ ] = g - 1, \quad [g|h]d = g[h] - [gh] + [g]. \tag{3.1.26}$$
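The two identities claimed for the homogeneous resolution, that d squared is zero and that hd + dh is the identity, can be checked mechanically on basis elements. The sketch below represents a chain as a dictionary from tuples to integer coefficients and treats the empty tuple as the generator of Z in dimension −1, so that the augmentation appears as the last differential. This representation and the names `boundary`, `cone` and `add` are our own assumptions; the neutral element of G is represented by 0.

```python
from collections import Counter
from itertools import product


def boundary(chain):
    """Apply the differential (3.1.24) to an integer combination of tuples over G."""
    result = Counter()
    for simplex, coeff in chain.items():
        for i in range(len(simplex)):
            face = simplex[:i] + simplex[i + 1:]   # omit the i-th entry
            result[face] += (-1) ** i * coeff
    # discard the terms that have cancelled
    return {s: c for s, c in result.items() if c != 0}


def cone(chain):
    """The homotopy h: put the neutral element (here 0) in front of every tuple."""
    return {(0,) + s: c for s, c in chain.items()}


def add(x, y):
    total = Counter(x)
    for s, c in y.items():
        total[s] += c
    return {s: c for s, c in total.items() if c != 0}


# check both identities on all basis tuples of length 1, 2, 3 over a 3-element set
for length in (1, 2, 3):
    for s in product(range(3), repeat=length):
        x = {s: 1}
        assert boundary(boundary(x)) == {}                          # d^2 = 0
        assert add(boundary(cone(x)), cone(boundary(x))) == x       # hd + dh = 1
print("checked")
```

Since maps are written on the right in the text, hd means "first h, then d", which is `boundary(cone(x))` in the code. The group multiplication plays no part in the homogeneous differential, so any finite set of labels serves for the check.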


As an illustration let us calculate $H_1(G, \mathbb{Z})$. We take the resolution X of Z and tensor
with Z:

$$\cdots \to \mathbb{Z} \otimes X_2 \to \mathbb{Z} \otimes X_1 \to \mathbb{Z} \otimes X_0 \to \mathbb{Z} \to 0.$$

Since Z is a trivial G-module, we see that at $\mathbb{Z} \otimes X_1$ the kernel is all of $\mathbb{Z} \otimes X_1$, while
the image is generated by the elements $1 \otimes u$, where $u = [h] - [gh] + [g]$, by
(3.1.26). Thus $H_1(G, \mathbb{Z})$ consists of all sums of terms $1 \otimes [g]$ subject to the relation

$$1 \otimes [gh] = 1 \otimes [g] + 1 \otimes [h].$$

This is just G made abelian, $G^{ab} = G/G'$, hence we have

Proposition 3.1.3. For any group G we have $H_1(G, \mathbb{Z}) = \operatorname{Tor}_1^G(\mathbb{Z}, \mathbb{Z}) \cong G^{ab}$. ∎
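For a small group the order of the abelianized group, and hence of the first homology group with integer coefficients, can be found by brute force. The following sketch does this for symmetric groups, with permutations stored as tuples and composed in the right-hand convention of the text; the helper names `compose`, `inverse`, `generated_subgroup` and `abelianization_order` are ours.

```python
from itertools import permutations


def compose(p, q):
    """Apply p first, then q (maps written on the right)."""
    return tuple(q[p[i]] for i in range(len(p)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def generated_subgroup(generators, identity):
    """Closure of a set of permutations under multiplication; in a finite group
    this closure is the subgroup generated."""
    elements = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in generators:
            y = compose(x, g)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return elements


def abelianization_order(n):
    """Order of G/G' for the symmetric group G of degree n."""
    identity = tuple(range(n))
    group = list(permutations(range(n)))
    # all commutators x^{-1} y^{-1} x y
    commutators = {compose(compose(inverse(x), inverse(y)), compose(x, y))
                   for x in group for y in group}
    derived = generated_subgroup(commutators, identity)
    return len(group) // len(derived)


print(abelianization_order(3), abelianization_order(4))
```

In both cases the derived group is the alternating group, so the quotient has order 2.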


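We next turn to the construction of $H^n(G, A)$ using the standard resolution.
The elements of $\operatorname{Hom}_G(X_n, A)$ are n-cochains, those of ker d, im d the cocycles and
coboundaries respectively. Taking $X_n$ now as right G-module, we find that an
n-cochain is a function $f : G^{n+1} \to A$ such that

$$f(s_0g, \ldots, s_ng) = f(s_0, \ldots, s_n)g \quad \text{for } g \in G.$$

Since g is arbitrary in G, f is completely determined by its values when its last
argument is 1. Let us write

$$\varphi(t_1, \ldots, t_n) = f(t_1t_2 \cdots t_n, t_2 \cdots t_n, \ldots, t_n, 1);$$

$\varphi$ is the inhomogeneous cochain corresponding to the homogeneous cochain f. In
terms of $\varphi$ the coboundary is given by the formula

$$(\varphi d)(t_1, \ldots, t_{n+1}) = \varphi(t_2, \ldots, t_{n+1}) + \sum_{i=1}^{n} (-1)^i\varphi(t_1, \ldots, t_it_{i+1}, \ldots, t_{n+1}) + (-1)^{n+1}\varphi(t_1, \ldots, t_n)t_{n+1}. \tag{3.1.27}$$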
Let us again write out the cases of low dimensions: 
n = 1. A 1-cocycle is a map $\varphi : G \to A$ such that $\varphi d = 0$. Thus $C^1$ consists of all $\varphi$
satisfying

$$\varphi(gg') = \varphi(g') + \varphi(g)g'. \tag{3.1.28}$$

It is a coboundary iff

$$\varphi(g) = c - cg \quad \text{for some fixed } c \in A. \tag{3.1.29}$$


A function $\varphi$ satisfying (3.1.28) is sometimes called a derivation (it is an
$(\varepsilon, 1)$-derivation in the sense of BA, Section 6.2) or a crossed homomorphism. If (3.1.29)
holds, it is said to be inner or a principal crossed homomorphism. Writing
Der(G, A) for the group of derivations and IDer(G, A) for the subgroup of inner
derivations, we have

$$H^1(G, A) = C^1/B^1 \cong \operatorname{Der}(G, A)/\operatorname{IDer}(G, A).$$


If G acts trivially on A, the derivations are just the homomorphisms and the inner
derivations are 0, so in this case $H^1(G, A) \cong \operatorname{Hom}(G, A)$. Since A is abelian, the
homomorphisms $G \to A$ correspond to the homomorphisms $G^{ab} \to A$, and so we
obtain


Proposition 3.1.4. For any group G and any module A with trivial G-action ag = a for
all $a \in A$, $g \in G$, we have

$$H^1(G, A) \cong \operatorname{Hom}(G, A) \cong \operatorname{Hom}(G^{ab}, A).$$
∎

n = 2. A 2-cocycle is a map $\psi : G^2 \to A$ such that

$$\psi(a, bc) + \psi(b, c) = \psi(ab, c) + \psi(a, b)c.$$


This is precisely the condition (3.1.15) for a factor set; moreover, a 2-coboundary is a
factor set of the form $\varphi(b) - \varphi(ab) + \varphi(a)b$; now (3.1.16) shows that two factor sets
are associated iff they differ by a coboundary. This establishes


Theorem 3.1.5. Let G be any group and A a G-module. Then there is a natural bijection
between the group $H^2(G, A)$ and the set of isomorphism classes of extensions of A by
G with the given G-action on A. ∎
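For very small groups the second cohomology group can be counted directly from the cocycle and coboundary conditions. The sketch below does this for a cyclic group of order n acting trivially on a cyclic group of order m, both written additively, by enumerating normalized 2-cocycles and dividing by the number of coboundaries of normalized 1-cochains. The restriction to normalized cochains, the trivial action and the name `h2_order` are assumptions made to keep the search small.

```python
from itertools import product


def h2_order(n, m):
    """Order of H^2 of a cyclic group of order n with coefficients in a cyclic
    group of order m (trivial action), by exhaustive enumeration."""
    G = range(n)
    pairs = [(a, b) for a in range(1, n) for b in range(1, n)]

    def is_cocycle(psi):
        # psi(a, bc) + psi(b, c) = psi(ab, c) + psi(a, b)c, with c acting trivially
        return all((psi[a, (b + c) % n] + psi[b, c]) % m ==
                   (psi[(a + b) % n, c] + psi[a, b]) % m
                   for a, b, c in product(G, repeat=3))

    cocycles = 0
    for values in product(range(m), repeat=len(pairs)):
        # normalized cochain: value 0 whenever one argument is the neutral element
        psi = {(a, b): 0 for a in G for b in G}
        psi.update(zip(pairs, values))
        cocycles += is_cocycle(psi)

    coboundaries = set()
    for values in product(range(m), repeat=n - 1):
        phi = (0,) + values                      # normalized 1-cochain
        # 2-coboundary: phi(b) - phi(ab) + phi(a)b
        coboundaries.add(tuple((phi[b] - phi[(a + b) % n] + phi[a]) % m
                               for a in G for b in G))
    return cocycles // len(coboundaries)


print(h2_order(2, 2), h2_order(2, 3), h2_order(3, 3))
```

For n = m = 2 the count is 2, matching the two extensions of a group of order 2 by a group of order 2 with central kernel (the four-group and the cyclic group of order 4); for coprime orders such as n = 2, m = 3 the count is 1, so only the split extension occurs.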


We observe that since $H^2(G, A)$ has a group structure, there is a multiplication of
extensions of A by G. It is obtained by taking the product (resp. sum) of the
corresponding factor sets (resp. cocycles) and is known as the Baer product (resp. sum).




Some simple calculations are suggested in the exercises; we add a general result which 
is often useful. 


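Proposition 3.1.6. Let G be a finite group of order r and A any G-module. Then for any
$n > 0$, each element of $H^n(G, A)$ has order dividing r.


Proof. We define a ‘homotopy mod r’, taking each n-cochain $\varphi$ to an (n − 1)-cochain $\varphi h$, by the equation

$$(\varphi h)(t_1, \ldots, t_{n-1}) = \sum_{t \in G} \varphi(t, t_1, \ldots, t_{n-1}).$$

A short calculation with (3.1.27) gives

$$(\varphi dh)(t_1, \ldots, t_n) = r\varphi(t_1, \ldots, t_n) - (\varphi hd)(t_1, \ldots, t_n).$$

Hence we have

$$dh + hd = r.1.$$

If c is any cocycle, then cd = 0, hence rc = cdh + chd = (ch)d, therefore rc is a
coboundary and so $rH^n(G, A) = 0$, as claimed. ∎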


For finite A the order of A annihilates $H^n(G, A)$, so Proposition 3.1.6 yields


Corollary 3.1.7. If G, A are both finite, of coprime orders, then $H^n(G, A) = 0$ for any
$n > 0$. ∎


It is clear from Proposition 3.1.6 that if A is uniquely divisible, as abelian group,
then $H^n(G, A) = 0$ for any finite group G and any $n > 0$. This remark can be used to
compute $H^2$ of a finite group over Z:


Proposition 3.1.8. Let G be any finite group and $K = \mathbb{Q}/\mathbb{Z}$ the group of rational
numbers mod 1, as trivial G-module. Then

$$H^2(G, \mathbb{Z}) \cong \operatorname{Hom}(G, K). \tag{3.1.30}$$


Proof. The exact sequence $0 \to \mathbb{Z} \to \mathbb{Q} \to K \to 0$ leads to the derived sequence

$$H^1(G, \mathbb{Q}) \to H^1(G, K) \to H^2(G, \mathbb{Z}) \to H^2(G, \mathbb{Q}).$$

Since Q is uniquely divisible, the extreme terms are 0 and so the other two are
isomorphic: $H^2(G, \mathbb{Z}) \cong H^1(G, K)$. Since the G-action on K is trivial,
$H^1(G, K) \cong \operatorname{Hom}(G, K)$ and the result follows. ∎


A similar proof, replacing Q by R, shows that $H^2(G, \mathbb{Z}) \cong \operatorname{Hom}(G, T)$, where
$T = \mathbb{R}/\mathbb{Z}$. More directly we see that Hom(G, T) = Hom(G, K) for any finite group
G, because every homomorphism of G into T has its image in the torsion subgroup
of T, and this is just K. We also note that $\operatorname{Hom}(G, T) \cong \operatorname{Hom}(G, \mathbb{C}^\times)$, because
$x \mapsto \exp(2\pi ix)$ embeds T in $\mathbb{C}^\times$ as the unit circle, which contains all elements of finite order.




The group $H^2(G, K)$ is called the multiplicator of G. It arises when we consider
representations of G by linear transformations of a projective space, briefly projective
transformations. If $\alpha \in G$ is represented by a matrix $\rho(\alpha)$ over C say, we have

$$\rho(\alpha)\rho(\beta) = c_{\alpha,\beta}\,\rho(\alpha\beta),$$

where $c_{\alpha,\beta} \in \mathbb{C}^\times$ is a factor set, corresponding to an element of $H^2(G, K)$.

As we have seen, from a short exact sequence of G-modules we can derive a long
exact sequence by applying Hom or the tensor product. This argument cannot be
applied directly to an exact sequence describing a group extension

$$1 \to N \xrightarrow{\lambda} G \xrightarrow{\mu} L \to 1. \tag{3.1.31}$$

Nevertheless a similar result can be obtained on going via a short exact sequence of
L-modules. It is given by


Theorem 3.1.9. Given an exact sequence (3.1.31) describing a group extension G, there
is a 5-term exact sequence for any L-module A:

$$0 \to H^1(L, A) \xrightarrow{\mu^*} H^1(G, A) \xrightarrow{\lambda^*} \operatorname{Hom}_L(N^{ab}, A) \xrightarrow{\tau} H^2(L, A) \xrightarrow{\mu^*} H^2(G, A). \tag{3.1.32}$$

The map $\mu^*$, arising from $\mu$, is called the inflation map and $\lambda^*$, arising from the
inclusion $\lambda$, is called the restriction map. The connecting map $\tau$ is called the
transgression.


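Proof. By tensoring (3.1.20) with ZL over G we obtain the exact sequence

$$0 \to \operatorname{Tor}_1^G(\mathbb{Z}L, \mathbb{Z}) \to \mathbb{Z}L \otimes_G IG \to \mathbb{Z}L \otimes_G \mathbb{Z}G \to \mathbb{Z}L \otimes_G \mathbb{Z} \to 0.$$

Here the last two terms just represent the augmentation of ZL, so the kernel in the
third term is IL. For the first term we have, by Theorem 2.7.9 and Proposition 3.1.3,
$\operatorname{Tor}_1^G(\mathbb{Z}L, \mathbb{Z}) \cong \operatorname{Tor}_1^N(\mathbb{Z}, \mathbb{Z}) \cong N^{ab}$. Hence we obtain the exact sequence of
L-modules

$$0 \to N^{ab} \to \mathbb{Z}L \otimes_G IG \to IL \to 0. \tag{3.1.33}$$

From the associativity of the tensor product we know that for any L-module A,

$$\operatorname{Hom}_L(\mathbb{Z}L \otimes_G IG, A) \cong \operatorname{Hom}_G(IG, A).$$

Let us write the two sides of this formula as $P(A) \cong Q(A)$ and write $P^n$, $Q^n$ for the
right derived functors. We take a short resolution of A with I injective:

$$0 \to A \to I \to C \to 0,$$

and apply P, Q, recalling that $P^1$ vanishes on injectives:

$$\begin{array}{ccccccccccc}
0 & \to & P(A) & \to & P(I) & \to & P(C) & \to & P^1(A) & \to & 0 \\
 & & \downarrow\cong & & \downarrow\cong & & \downarrow\cong & & \downarrow & & \\
0 & \to & Q(A) & \to & Q(I) & \to & Q(C) & \to & Q^1(A) & \to & Q^1(I)
\end{array}$$

It follows that there is an injection from $P^1(A)$ to $Q^1(A)$, i.e. from $\operatorname{Ext}^1_L(\mathbb{Z}L \otimes_G IG, A)$ to
$\operatorname{Ext}^1_G(IG, A)$. If we now apply $\operatorname{Hom}_L(-, A)$ to (3.1.33), we find

$$0 \to \operatorname{Hom}_L(IL, A) \to \operatorname{Hom}_L(\mathbb{Z}L \otimes_G IG, A) \to \operatorname{Hom}_L(N^{ab}, A) \to \operatorname{Ext}^1_L(IL, A) \to \operatorname{Ext}^1_L(\mathbb{Z}L \otimes_G IG, A).$$

This reduces to (3.1.32) if we use (3.1.23) on the first two terms, replace the last term
by $\operatorname{Ext}^1_G(IG, A)$ and again use (3.1.23). ∎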


Occasionally the cohomology groups are needed for a more general coefficient 
ring. If K is any commutative ring and KG is the group algebra of G over K, then 
the standard resolution $X \to \mathbb{Z} \to 0$ on tensoring with K over Z becomes

$$X \otimes_{\mathbb{Z}} K \to K \to 0,$$

and this is still a projective resolution, because the original resolution was Z-split, by
the homotopy found earlier. Thus we obtain, for any KG-module A,

$$\operatorname{Ext}^n_G(\mathbb{Z}, A) \cong \operatorname{Ext}^n_{KG}(KG \otimes_G \mathbb{Z}, A) \cong \operatorname{Ext}^n_{KG}(K, A). \tag{3.1.34}$$

Similarly we have

$$\operatorname{Tor}_n^G(\mathbb{Z}, A) \cong \operatorname{Tor}_n^{KG}(K, A). \tag{3.1.35}$$

We recall that the terms in (3.1.34) vanish for $n \geq 1$ if A is coinduced, and those in
(3.1.35) vanish if A is induced. But for G-modules over a finite group G, induced and
coinduced mean the same thing, for if $f : K \to KG$ is the inclusion map, then there is
an isomorphism of KG-modules:

$$\operatorname{Hom}_K(KG, U) \cong U \otimes KG,$$

for any K-module U, given by $\alpha \mapsto \sum_s s\alpha \otimes s^{-1}$.


Exercises 


1. Supply the details for the proof of Theorem 3.1.2. 

2. Examine the case of unnormalized factor sets. 

3. (O. Hölder) Let E be an extension of A by B, where A, B are cyclic groups of
orders m, n respectively (such a group E is called metacyclic). Show that E has
a presentation $\mathrm{gp}\{a, b \mid a^m = 1,\ b^n = a^r,\ b^{-1}ab = a^s\}$, where $s^n \equiv 1 \pmod m$
and $r(s - 1) \equiv 0 \pmod m$. Conversely, given m, n, r, s satisfying these relations,
show that there is a metacyclic group with this presentation.

4. Show that an extension of A by G with factor set $\{m_{\alpha,\beta}\}$ splits iff there exist
$c_\alpha \in A$ such that $m_{\alpha,\beta} = c_{\alpha\beta}((c_\alpha\theta_\beta)c_\beta)^{-1}$. Let E be an extension of an abelian
group A by G. By adjoining free abelian generators $c_\alpha$ to E and using the
above relations to define $c_\alpha\theta_\beta$, show that E can be embedded in a group E*
which is a semidirect product of A* and G, where $A^* \supseteq A$.

5. Show that the group of isometries of Euclidean n-space is a split extension of the 
normal subgroup of translations by the orthogonal group. 




6. Show that $H^1(G, A)$ is in natural bijection with the set of isomorphism classes of
A-torsors, i.e. left A-sets with regular A-action such that $(a.x)^\alpha = a^\alpha.x^\alpha$
($\alpha \in G$, $a \in A$).

7. Show that the augmentation ideal, defined as in (3.1.20), is Z-free on the s − 1
($1 \neq s \in G$). Verify that $\operatorname{Hom}_G(IG, A)$ is isomorphic to the group of 1-cocycles.

8. Using the mapping $s \mapsto s - 1 \pmod{(IG)^2}$ of G into $IG/(IG)^2$ show that
$G^{ab} \cong IG/(IG)^2$. Deduce another proof of Proposition 3.1.3.

9. Let G be a finite group of order r. Show that every cocycle in $H^n(G, \mathbb{C}^\times)$ is
cohomologous to a cocycle whose values are r-th roots of 1. Deduce that the
multiplicator of G has order at most $r^{r^2}$.

10. Let $G = C_m$, the cyclic group of order m, with generator s and write D = s − 1,
$N = 1 + s + \ldots + s^{m-1}$ in ZG. Verify that there is a free resolution
$W \to \mathbb{Z} \to 0$, where $W_n = \mathbb{Z}G$ and $d_{2n} : W_{2n} \to W_{2n-1}$ is $d_{2n} = N$, while
$d_{2n-1} = D$. With the notation ${}_NA = \{a \in A \mid aN = 0\}$ for any G-module A
show that $H^{2n}(G, A) = A^G/AN$, $H^{2n-1}(G, A) = {}_NA/AD$ and
$H_n(G, A) = H^{n-1}(G, A)$ for all n > 1.


3.2 Hall subgroups 


One of the first results in group theory is Lagrange’s theorem, which tells us that the
order of a subgroup of G is a divisor of the order of G. One soon discovers that the
converse is false: for example, $\mathrm{Alt}_4$ has order 12 but contains no subgroup of order 6.
The first positive result in this direction was Sylow’s theorem, which showed that for
prime powers the converse of Lagrange’s theorem is true. A significant generalization
was found by Philip Hall, who showed in 1928 that in a soluble group G, for every
factorization of the order of G into two coprime integers, |G| = mn, (m, n) = 1,
there is a subgroup of order m, and in 1937 he showed that solubility is necessary
for this to happen.
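The failure of the converse of Lagrange's theorem for the alternating group of degree 4 is small enough to confirm by exhaustive search. The sketch below lists the orders of all subgroups by testing every subset containing the neutral element for closure; the helper names `compose`, `is_even` and `subgroup_orders` are ours, and the method is only practical for very small groups.

```python
from itertools import combinations, permutations


def compose(p, q):
    """Apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))


def is_even(p):
    # parity of a permutation via its number of inversions
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return inversions % 2 == 0


def subgroup_orders(group, identity):
    """Orders of all subgroups of a small finite group, found by testing subsets."""
    others = [g for g in group if g != identity]
    orders = set()
    for k in range(len(others) + 1):
        for rest in combinations(others, k):
            subset = set(rest) | {identity}
            # a finite subset containing 1 is a subgroup iff it is closed under products
            if all(compose(x, y) in subset for x in subset for y in subset):
                orders.add(len(subset))
    return orders


alt4 = [p for p in permutations(range(4)) if is_even(p)]
print(sorted(subgroup_orders(alt4, (0, 1, 2, 3))))
```

Every divisor of 12 except 6 appears among the subgroup orders.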

It is convenient to begin with some notation. Let $\pi$ be a set of prime numbers. By
the $\pi$-part of an integer $n = \prod p^{\alpha_p}$ we understand the number $n_\pi = \prod_{p \in \pi} p^{\alpha_p}$. The
complementary set of primes is written $\pi'$, so for any positive integer n we have

$$n = n_\pi n_{\pi'}.$$

For any finite group G, $\pi(G)$ denotes the set of primes dividing |G|. If $\pi(G) \subseteq \pi$, the
group G is called a $\pi$-group. A Hall subgroup is a subgroup of G whose order is prime
to its index. A $\pi$-subgroup of G whose index is prime to $\pi$ (i.e. not divisible by any
prime in $\pi$) is called a Hall $\pi$-subgroup; e.g. when $\pi = \{p\}$, a Hall p-subgroup is just a
Sylow p-subgroup. A Hall p'-subgroup is also called a p-complement. To establish the
existence of Hall subgroups in soluble groups we shall need some preliminary results.
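The arithmetic behind these definitions is easily made explicit. The following sketch computes the two complementary parts of an integer with respect to a set of primes and tests whether a given order could be that of a Hall subgroup; the function names are ours, and the argument `pi` is assumed to be a finite set of prime numbers (this is not checked).

```python
from math import gcd


def pi_part(n, pi):
    """The pi-part of n: the largest divisor of n all of whose prime factors lie in pi."""
    result = 1
    for p in pi:
        while n % p == 0:
            n //= p
            result *= p
    return result


def pi_prime_part(n, pi):
    """The part of n belonging to the complementary set of primes."""
    return n // pi_part(n, pi)


def is_hall_order(group_order, subgroup_order):
    """True if subgroup_order divides group_order and is prime to the index."""
    if group_order % subgroup_order != 0:
        return False
    return gcd(subgroup_order, group_order // subgroup_order) == 1


print(pi_part(360, {2, 3}), pi_prime_part(360, {2, 3}))
print(is_hall_order(60, 12), is_hall_order(60, 6))
```

Note that `is_hall_order` only tests the numerical condition; whether a subgroup of that order exists is the question answered by the theorems of this section.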


Lemma 3.2.1. Let G be a finite group and H, K any subgroups. Then

$$(HK : H) = (K : H \cap K). \tag{3.2.1}$$

In particular, if $(G : H) = (K : H \cap K)$, then HK = G.


Proof. Clearly HK is a union of cosets of H, say $HK = Hk_1 \cup \ldots \cup Hk_r$, where $k_i \in K$
and $r = (HK : H)$. We claim that, writing $D = H \cap K$, we have the coset decomposition

$$K = Dk_1 \cup \ldots \cup Dk_r; \tag{3.2.2}$$

this will establish (3.2.1). Any $x \in K$ can by hypothesis be written as $x = hk_i$, where
$h \in H$. Here $h = xk_i^{-1} \in H \cap K = D$; if $hk_i = h'k_j$, then $k_ik_j^{-1} \in H$ and it follows
that i = j. This proves the coset decomposition (3.2.2), and (3.2.1) follows. Now if
$(G : H) = (K : H \cap K)$, then $(G : H) = (HK : H)$, hence G and HK contain the
same number of cosets of H and so HK = G. ∎


We note that in the special case where H or K is normal in G, the lemma follows 
from the second isomorphism theorem (Theorem 1.2.6). 


Lemma 3.2.2. If H, K are subgroups of a finite group G whose indices in G are coprime,
then HK = G and $(G : H \cap K) = (G : H)(G : K)$.

Proof. Put (G : H) = m, (G : K) = n; by Lemma 3.2.1 we have

$$(G : H \cap K) = (G : K)(K : H \cap K) = (G : K)(HK : H).$$

Clearly $(HK : H) \leq m = (G : H)$, hence

$$(G : H \cap K) = nm_1, \quad \text{where } m_1 \leq m.$$

Similarly,

$$(G : H \cap K) = mn_1, \quad \text{where } n_1 \leq n.$$

Hence $m_1n = mn_1$; since m and n are coprime, m divides $m_1$ and n divides $n_1$, so
$m_1 = m$, $n_1 = n$ and it follows that

$$(G : H \cap K) = mn.$$

Moreover, $(G : H) = m = m_1 = (HK : H)$, hence HK = G, by Lemma 3.2.1. ∎
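Both lemmas can be observed in the symmetric group of degree 4, taking for H the stabilizer of one point (index 4) and for K a dihedral group of order 8 (index 3). The sketch below counts cosets directly; the particular generators and the helper names `generated_subgroup`, `product_set` and `right_cosets` are our own choices for illustration.

```python
from itertools import permutations


def compose(p, q):
    """Apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))


def generated_subgroup(generators, n):
    """Subgroup of the symmetric group of degree n generated by the given permutations."""
    identity = tuple(range(n))
    elements = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in generators:
            y = compose(x, g)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return elements


def product_set(H, K):
    return {compose(h, k) for h in H for k in K}


def right_cosets(H, S):
    """The cosets Hx for x in S, each stored as a frozenset."""
    return {frozenset(compose(h, x) for h in H) for x in S}


G = set(permutations(range(4)))
H = generated_subgroup([(1, 0, 2, 3), (1, 2, 0, 3)], 4)   # symmetric group on {0, 1, 2}
K = generated_subgroup([(1, 2, 3, 0), (2, 1, 0, 3)], 4)   # dihedral group of order 8
D = H & K
HK = product_set(H, K)

print(len(right_cosets(H, HK)), len(right_cosets(D, K)))   # the two sides of (3.2.1)
print(HK == G, len(G) // len(D))                            # HK = G and (G : D)
```

Here the indices 4 and 3 are coprime, so Lemma 3.2.2 predicts that HK is the whole group and that the intersection has index 12.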

We shall need the result that a normal Hall subgroup always has a complement.
When the normal subgroup is abelian, this follows from Corollary 3.1.7; the general
case can be deduced from this by an induction on the order:


Theorem 3.2.3 (Schur–Zassenhaus). Let G, H be two finite groups. If the orders of G,
H are coprime, then every extension of H by G splits. Moreover, if either G or H is
soluble, then any two complements of H in an extension are conjugate.


Proof. Write |G| = n, |H| = r, so that (n, r) = 1, and consider an extension

$$1 \to H \to E \xrightarrow{\lambda} G \to 1.$$

If H is abelian, the extension is represented by an element in $H^2(G, H)$ and this
group is 0 by Corollary 3.1.7, so the extension splits. In the general case we use
induction on r. Clearly we need only find a subgroup K of order n in E; this must
be a complement to H, because $H \cap K = 1$ by comparing orders, and so $\lambda|K$ maps
onto G.

Take a prime divisor p of r and let P be a Sylow p-subgroup of E; if its normalizer
is $N = N_E(P)$, then $N_H(P) = H \cap N$ and the number of conjugates of P in H is
$(H : H \cap N) = (E : N)$.




It follows that $n = (E : H) = (N : N \cap H)$. Now P and $H \cap N$ are normal in N and
N/P is an extension of $(H \cap N)/P$ by $N/(H \cap N)$, whose orders divide r and n
respectively, and so are coprime. Since $P \neq 1$, we can apply the induction hypothesis
and find a subgroup K of N such that $P \subseteq K \subseteq N$ and (K : P) = n.

Let Z be the centre of P; this is a non-trivial characteristic subgroup of P, hence
$Z \lhd N$ and so $Z \lhd K$. By induction we obtain a subgroup L of K such that
$Z \subseteq L \subseteq K$ and (L : Z) = n. Now L is an extension of the abelian group Z and
the orders are again coprime, so we have a subgroup M of L of order n, and this
is the required complement.

If an extension E has two complements $G_1$, $G_2$ of H, suppose first that H is abelian.
An element x of G corresponds to $xf_1$, $xf_2 = xf_1.c_x$ of $G_1$, $G_2$, where $c_x$ is a cocycle of
G in H. Since $H^1(G, H) = 0$, we have $c_x = u^{-x}u$ for some $u \in H$, hence
$xf_2 = xf_1.u^{-x}u = u^{-1}.xf_1.u$ and so $G_2$ is conjugate to $G_1$, as claimed.

Next assume that H is soluble and let $G_1$, $G_2$ be two complements. Applying the
previous case to E/H' we find that $G_2H' = (G_1H')^u$ for some $u \in H$, i.e.
$G_1^uH' = G_2H'$. By induction on the derived length of H there exists $v \in H'$ such
that $G_1^{uv} = G_2$ and the result follows.

Finally assume that G is soluble and let $G_1$, $G_2$ again be complements of H in E.
If G is a p-group, then $G_1$, $G_2$ are Sylow p-subgroups of E and hence are conjugate,
by Sylow’s theorem. In the general case let $M_1$ be a minimal normal subgroup of $G_1$
and $M_2$ the corresponding subgroup of $G_2$. Then $M_1H = M_2H$ and by induction
there exists $x \in H$ such that $M_1^x = M_2$. Replacing $G_1$ by $G_1^x$ we may thus assume
that $M_1 = M_2$. Now $G_1H/M_1 = G_2H/M_2$ and by the induction hypothesis there
exists $y \in H$ such that $G_1^y/M_1 = G_2/M_2$; hence $G_1^y = G_2$ as we had to show. ∎




We remark that by the Feit-Thompson theorem every group of odd order is 
soluble. This shows that the hypothesis of Theorem 3.2.3 is always satisfied, since 
of two coprime integers at least one must be odd. 

Let $\pi$ be any set of primes and let G be a finite group. It is clear that if H is a Hall
$\pi$-subgroup of G, then H is also a Hall $\pi$-subgroup of any subgroup of G containing
H, and under any homomorphism f from G, the image Hf is a Hall subgroup of Gf.
Further, if p, q are distinct primes, H is a Hall p'-subgroup and K a Hall q'-subgroup
of G, then |HK| is divisible by $|G|_{p'}$ and $|G|_{q'}$, hence by |G| and so HK = G. Moreover
$H \cap K$ is a Hall p'-subgroup of K, a q'-subgroup of H and a {p, q}'-subgroup of G.

We shall also need an auxiliary result. For any finite group G, a minimal normal 
subgroup is a direct product of simple groups; we shall only need the case where G is 
soluble, when a minimal normal subgroup is a direct power of a cyclic group of 
prime order p, i.e. an elementary abelian p-group. 


Lemma 3.2.4. Let G be a finite soluble group. Then any minimal normal subgroup is
elementary abelian.


Proof. Any minimal normal subgroup H of G is abelian, because its derived group is
a proper subgroup and so must be the trivial group. Now H can have no proper
non-trivial characteristic subgroups and so must be a p-group for some prime p, and
moreover all elements satisfy $x^p = 1$, i.e. it is elementary abelian, as claimed. ∎


With these preparations we can establish the existence of Hall subgroups. 


Theorem 3.2.5 (P. Hall [1928]). Let G be a finite soluble group and $\pi$ a finite set
of primes. Then any $\pi$-subgroup of G is contained in a Hall $\pi$-subgroup, hence Hall
$\pi$-subgroups exist for any $\pi$, and any two Hall $\pi$-subgroups are conjugate in G.


Proof. We shall use induction on |G|. Let G be a finite soluble group and M a minimal
normal subgroup of G. By Lemma 3.2.4, M is an elementary abelian p-group, for
some prime p. Write Ḡ = G/M; by the induction hypothesis Ḡ contains a Hall
π-subgroup H̄ = H/M, where H ⊇ M. Moreover, if A is a π-subgroup of G, its image Ā in
Ḡ is a π-subgroup and so is contained in some conjugate of H̄, say Ā ⊆ H̄^x̄; it
follows that A ⊆ H^x. If p ∈ π, then (G : H)_π = 1 and H is a π-subgroup, hence a
Hall π-subgroup of G; so is H^x, and A has been embedded in a Hall π-subgroup.
Since 1 is always a π-subgroup, this shows that Hall π-subgroups exist. Moreover,
all Hall π-subgroups are conjugate, for if A is a Hall π-subgroup and A ⊆ H^x,
then A = H^x because |A| = |H^x|.

There remains the case p ∉ π. By Theorem 3.2.3 there is a complement K of M in H
and all complements are conjugate. Since |K| = |H̄| = |G|_π, K is a Hall π-subgroup
of G, and all the Hall π-subgroups are conjugate. If A is any π-subgroup of G, we
have as before A ⊆ H^x for some x ∈ G. Either H ⊂ G, then by induction, A is contained
in a Hall π-subgroup of H^x, which is also a Hall π-subgroup of G; or H = G.
Then A ⊆ G = KM and hence


\[ AM = AM \cap KM = (AM \cap K)M. \]




Now A and AM ∩ K are Hall π-subgroups of AM and so are conjugate: A^y =
AM ∩ K ⊆ K (y ∈ AM). It follows that A is contained in a Hall π-subgroup of G. ∎


For the converse we shall use an interesting solubility criterion due to Helmut 
Wielandt: 


Theorem 3.2.6. Let G be a finite group. If G has three soluble subgroups whose indices
are pairwise coprime, then G is soluble.


Proof. Let the subgroups be H_1, H_2, H_3; if H_1 = 1, then |G| = (G : H_1) is prime to
(G : H_2), so the latter is 1 and H_2 = G is soluble. Hence we may assume that H_1 ≠ 1.
Let M be a minimal normal subgroup of H_1; since H_1 is soluble, M is an elementary
abelian p-group, for some prime p. Now p cannot divide the indices of both H_2 and
H_3, say it is prime to (G : H_2). Then p | |H_2|, hence H_2 contains a non-trivial Sylow
p-subgroup P, which is also a Sylow p-subgroup of G. Let P_1 be a Sylow p-subgroup
of H_1; then P_1 ⊆ P^x for some x ∈ G. We may replace H_2 by H_2^x without affecting the
hypothesis; then P_1 ⊆ P and M ⊆ P, because M ◁ H_1. Thus we have M ⊆ H_1 ∩ H_2.
Now by Lemma 3.2.2, G = H_1H_2; hence any x ∈ G has the form x = x_1x_2, x_i ∈ H_i;
therefore M^x = M^{x_2} ⊆ H_2. Hence H_2 contains the normal closure of M in G:
K = M^G = gp{M^y | y ∈ G} ⊆ H_2. Since H_2 is soluble, so is K. Further, the subgroups
KH_i/K of G/K satisfy the hypothesis, so G/K is soluble by induction on |G|, hence so is G. ∎


We can now complete the proof of Hall’s criterion; we recall that a p-complement 
is a subgroup of p-power index and order prime to p; in particular, if p does not 
divide the order of G, then G itself is the only p-complement. 


Theorem 3.2.7 (P. Hall [1937]). Any finite group is soluble if and only if it contains a
p-complement for each prime p.


Proof. For soluble groups the result follows by Theorem 3.2.5. Now assume that
|G| = p_1^{a_1} ... p_r^{a_r}, where the p_i are distinct primes, and that G has a p_i-complement
H_i for i = 1, ..., r. If r = 1 or 2, G is soluble by Burnside's p^a q^b-theorem (Theorem
6.8.3 below), hence we may assume r ≥ 3. We claim that each H_i is soluble. Clearly
(G : H_i) = p_i^{a_i}, hence by Lemma 3.2.2, H_1 ∩ H_j (j ≠ 1) is a Hall π_j-subgroup of G, where
π_j = π(G)\{p_1, p_j}, and it follows that H_1 ∩ H_j is a p_j-complement of H_1. By induction
H_1 is soluble, similarly for H_2, ..., H_r; now we apply Theorem 3.2.6 to complete
the proof. ∎

Exercises


1. By a Hall system in a finite group G is meant a family Σ of Hall subgroups of G,
one of order d for each divisor d of |G| prime to |G|/d, such that each H ∈ Σ is
the intersection of all subgroups in Σ whose orders are divisible by |H|. Verify
that every family of p-complements, where p runs over the prime divisors of
|G|, gives rise to a Hall system, and that for any H, K ∈ Σ, HK = KH ∈ Σ.




2. Find all Hall systems in Sym,. 

3. Show that Lemma 3.2.1 holds for any group G such that the subgroup H is of 
finite index. Extend Lemma 3.2.2 similarly. 

4. Let G be a finite group and P_i a Sylow p_i-subgroup, for each prime divisor p_i of
|G| (i = 1, ..., r). Show that if P_iP_j = P_jP_i for i, j = 1, ..., r, then G is soluble
and the different products of the P_i constitute a Hall system. Show that G is
nilpotent iff it has a single Hall system.

5. Show that if G is a finite soluble (non-trivial) group, then for some prime p there 
is a non-trivial abelian normal p-subgroup of G. (Hint. Examine the last link in 
the derived series for G.) 

6. Let G be a finite soluble group and p a prime divisor of |G|. Show that the number
of p-complements in G is a power of p. Use Lemma 3.2.2 to deduce that all Hall 
systems in G are conjugate. 

7. Let G be a finite soluble group. Show that every subgroup containing the normal- 
izer of a Hall subgroup is its own normalizer. 

8. By examining Sym; show that Theorem 3.2.6 does not remain true when the 
number of subgroups is reduced to 2. 

9. Show that in any finite group each minimal normal subgroup is a direct product 
of isomorphic simple groups. 


3.3 The transfer 


Unlike rings, groups have only one binary operation, which makes it harder to form 
constructions like the determinant. Even the latter depends on the commutativity of 
the ring for its definition; later, in Section 9.2, we shall see that determinants can be 
defined over a skew field, to take values in an abelian group, and it turns out that 
there is an analogous construction for groups, the transfer, which we shall describe 
now, with some applications. 

To define it, take any group G and a subgroup H of finite index, n say, and
consider a fixed coset decomposition of G:


\[ G = Hs_1 \cup Hs_2 \cup \dots \cup Hs_n. \qquad (3.3.1) \]


We can represent G by matrices as follows. Each a ∈ G permutes the cosets in (3.3.1)
by right multiplication; thus there is a permutation σ_a of 1, 2, ..., n such that
Hs_i a = Hs_{iσ_a}. Hence s_i a s_{iσ_a}^{-1} ∈ H and we can define an n × n matrix
μ(a) = (μ_{ij}(a)) with entries in H by the equation


\[ \mu_{ij}(a) = \begin{cases} s_i a s_j^{-1} & \text{if } j = i\sigma_a, \\ 0 & \text{otherwise.} \end{cases} \]


Thus μ(a) has a single non-zero entry in each row and one in each column; it is
called a monomial matrix over H. To show that we have indeed a representation,
take a, b ∈ G; we have σ_{ab} = σ_aσ_b because σ is a permutation representation of G
on the cosets (3.3.1). Hence we have
s_i a s_{iσ_a}^{-1} · s_{iσ_a} b s_{iσ_{ab}}^{-1} = s_i(ab)s_{iσ_{ab}}^{-1} ∈ H and it follows
that


\[ \mu_{ik}(ab) = \begin{cases} \mu_{i,i\sigma_a}(a)\,\mu_{i\sigma_a,k}(b) & \text{if } k = i\sigma_{ab}, \\ 0 & \text{otherwise.} \end{cases} \]


Thus μ_{ik}(ab) = Σ_j μ_{ij}(a)μ_{jk}(b), and we have indeed a representation. Of course the
representation still depends on the choice of transversal {s_i}; to free ourselves of this
dependence, consider a homomorphism f : H → A to an abelian group A (written
multiplicatively) and put α_{ij}(a) = μ_{ij}(a)f. We claim that the mapping V : G → A
defined by


\[ aV = |\det \alpha(a)| = \prod_{i=1}^{n} (s_i a s_{i\sigma_a}^{-1})f \qquad (3.3.2) \]


is a homomorphism independent of the choice of transversal. For if t_i = h_i s_i (h_i ∈ H)
is another transversal and we form the corresponding expression, we obtain, writing
σ = σ_a,


\[ \prod_{i=1}^{n} (h_i s_i a s_{i\sigma}^{-1} h_{i\sigma}^{-1})f = \prod_{i=1}^{n} (h_i f) \prod_{i=1}^{n} (s_i a s_{i\sigma}^{-1})f \Bigl( \prod_{i=1}^{n} h_{i\sigma} f \Bigr)^{-1} = aV, \]


because σ is a permutation of 1, ..., n. Now the homomorphism property follows
from the multiplication law for determinants. This mapping V is called the transfer
of G into A (German: Verlagerung). We remark that ker V ⊇ G', so that we can
factor V by the natural map from G to G^{ab} = G/G', to obtain a homomorphism
from G^{ab} to A.
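
For small permutation groups the defining product (3.3.2) can be evaluated mechanically. The following Python sketch is an illustration only and not part of the text: the helper names (mul, inv, generate, transversal, transfer) are our own, permutations are written as tuples acting on the right, and we assume that H is abelian and f is the identity map of H, so that the order of the factors is immaterial.

```python
def mul(p, q):
    """Product pq in the right-action convention: apply p first, then q."""
    return tuple(q[p[x]] for x in range(len(p)))

def inv(p):
    """Inverse permutation."""
    r = [0] * len(p)
    for x, y in enumerate(p):
        r[y] = x
    return tuple(r)

def generate(gens):
    """All elements of the finite permutation group generated by gens."""
    e = tuple(range(len(gens[0])))
    elems, frontier = {e}, [e]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = mul(g, s)
            if h not in elems:
                elems.add(h)
                frontier.append(h)
    return elems

def transversal(G, H):
    """One representative s_i from each right coset H s_i."""
    reps, seen = [], set()
    for g in sorted(G):
        if g not in seen:
            reps.append(g)
            seen.update(mul(h, g) for h in H)
    return reps

def transfer(a, H, reps):
    """aV = product over i of s_i a s_j^{-1}, where H s_i a = H s_j (H abelian, f = id)."""
    result = tuple(range(len(a)))
    for s in reps:
        sa = mul(s, a)
        for t in reps:
            h = mul(sa, inv(t))
            if h in H:            # t is the representative of the coset H s a
                result = mul(result, h)
                break
    return result

# Cyclic group of order 4 generated by r, with H = gp{r^2}.
r = (1, 2, 3, 0)
C4 = generate([r])
H2 = generate([mul(r, r)])
print(transfer(r, H2, transversal(C4, H2)))
```

In this example the sketch yields rV = r^2, in agreement with the fact that for an abelian group G the transfer into a subgroup of index n is the map a ↦ a^n.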

Let us examine the action of a ∈ G on G/H more closely. Suppose that G/H
consists of r orbits of n_i points (i = 1, ..., r) under the action of gp{a}, so that
Σ n_i = n = (G : H). If we take our transversal in the form t_i a^j (i = 1, ..., r,
j = 0, 1, ..., n_i - 1), then the action on the i-th orbit is represented by the cycle
(t_i, t_i a, ..., t_i a^{n_i - 1}), and its contribution to the transfer is t_i a^{n_i} t_i^{-1}. So we have

\[ aV = \prod_{i=1}^{r} (t_i a^{n_i} t_i^{-1})f, \quad \text{where } t_i a^{n_i} t_i^{-1} \in H. \qquad (3.3.3) \]


We shall apply the transfer to prove the existence of normal p-complements under 
suitable conditions. Here we need 


Lemma 3.3.1. Let G be a finite group, P a Sylow p-subgroup of G and X, Y two subsets
of G normalized by P and conjugate in G. Then X and Y are conjugate in N_G(P).


Proof. By hypothesis Y = X^b for some b ∈ G, and X, Y are normalized by P, hence
Y = X^b is normalized by P^b. Thus N = N_G(Y) contains P and P^b; clearly they are
Sylow subgroups of N and by Sylow's theorem they are conjugate in N, say
P^{bc} = P for c ∈ N. Writing a = bc, we have a ∈ N_G(P), hence X^a = X^{bc} =
Y^c = Y, because c ∈ N. ∎




Theorem 3.3.2 (Burnside). Let G be a finite group and suppose that the Sylow 
p-subgroup P of G is contained in the centre of its normalizer. Then P has a normal 
complement in G. 


Proof. By hypothesis P is abelian, so we can take the transfer V : G → P. For any
u ∈ P and n_i, t_i as in (3.3.3) (for H = P), u^{n_i} and t_i u^{n_i} t_i^{-1} lie in P, are normalized
by P and are conjugate in G. By the above lemma they are conjugate in N_G(P),
hence equal (because P lies in the centre of N_G(P)), so we obtain


\[ t_i u^{n_i} t_i^{-1} = u^{n_i}. \]


It follows that uV = u^n, where n = (G : P). Since p is prime to (G : P) and P is an
abelian p-group, it follows that V : G → P is surjective and the kernel is of index |P|,
so it is the desired complement. ∎


Corollary 3.3.3. Let G be a finite non-abelian group with a non-trivial cyclic Sylow 
2-subgroup. Then G has a normal 2-complement and so cannot be simple. 


Proof. Let P, of order 2^α, be a Sylow 2-subgroup of G. Since it is cyclic, its automorphism
group has order φ(2^α) = 2^{α-1}; therefore (N_G(P) : C_G(P)) is a factor of 2^{α-1}.
But P is abelian, so P ⊆ C_G(P) and this index also divides (G : P), which is odd. Hence
N_G(P) = C_G(P), i.e. P lies in the centre of its normalizer, so the hypothesis of
Theorem 3.3.2 is satisfied and we obtain a normal complement of P. ∎


Exercises 


1. In the definition (3.3.2) the sign of the determinant was ignored. Show that taking 
the sign into account only amounts to taking the sign of the permutation repre- 
sentation of G on G/H. 

2. Show that the corestriction mapping H_1(G, Z) → H_1(H, Z) (induced by the
restriction from G to H) is just the transfer (see also Section 5.6).

3. Show that if G has an abelian Sylow p-subgroup P with normalizer N, then
P ∩ G' = P ∩ N' and P = (P ∩ N') × (P ∩ Z(N)). Show also that the maximal
p-factor group of G (i.e. the maximal quotient group which is a p-group) is
≅ P ∩ Z(N).

4. Show that if a Sylow p-subgroup P of G has trivial intersection with any of its
distinct conjugates, then any two elements of P which are conjugate in G are
conjugate in N_G(P).

5. Let P, Q be two distinct Sylow p-subgroups of G, chosen so that (for fixed p)
P ∩ Q has maximal order. Show that the only conjugates of P ∩ Q in G that
are contained in P are conjugate to P ∩ Q in N_G(P).


3.4 Free groups 


Since groups form a class of algebras defined by laws (Section 1.3), it is clear what is
to be understood by a free group on a given set. But a peculiarity of the laws defining
groups allows the elements of a free group to be written in an easily recognized form,
that we shall now describe.

Let X be any non-empty set. By a group word in X we understand an expression


\[ u_1u_2 \dots u_n, \quad n \ge 0, \qquad (3.4.1) \]


where each u_i is either an element x or x^{-1} for some x ∈ X. Formally we can regard
the expressions (3.4.1) as the elements of the free monoid on X ∪ X^{-1}, where we
have put X^{-1} = {x^{-1} | x ∈ X}. For n = 0, (3.4.1) reduces to the empty word,
written 1. We define an elementary reduction of a word u_1 ... u_n as the process of
omitting a pair of neighbouring factors of the form xx^{-1} or x^{-1}x, for some
x ∈ X. The inverse process, inserting a factor xx^{-1} or x^{-1}x at some point, is an
elementary expansion. A word is said to be reduced if it contains no pairs of neighbouring
factors xx^{-1} or x^{-1}x; thus a word admits an elementary reduction precisely
if it is not reduced. Two group words in X are said to be equivalent if we can transform
one into the other by a series of elementary reductions and expansions.

It is clear that in any group G generated by a set X, two equivalent group words 
represent the same element. When there are relations in G between the group words 
in X, there will be inequivalent group words representing the same group element, 
but as we shall see, in a free group inequivalent group words represent distinct
group elements. 


Theorem 3.4.1. Let X be a set and F the free group on X. Then every element of F is 
represented by exactly one reduced group word in X and two group words represent the 
same element of F if and only if they are equivalent. 


Proof. If X = ∅, F is the trivial group consisting of 1 alone, so we may assume that
X is not empty. It is clear from the definitions that every element of F is represented
by a group word in X, and that equivalent group words represent the same element.
The multiplication of group words by juxtaposition is associative, and it is easily
checked that the equivalence class of a product depends only on those of the factors,
not on the factors themselves. Further, the empty word 1 acts as neutral element under
multiplication. Hence the set of equivalence classes forms a monoid under multiplication,


and this is in fact a group, since u_1u_2 ... u_n has the inverse u_n^{-1} ... u_1^{-1}, because

\[ u_1 \cdots u_n u_n^{-1} \cdots u_1^{-1} \to u_1 \cdots u_{n-1} u_{n-1}^{-1} \cdots u_1^{-1} \to \cdots \to u_1u_1^{-1} \to 1 \]

by elementary reductions. It only remains to show that each group word is equivalent
to exactly one reduced group word. Given a group word (3.4.1), we apply
elementary reductions as often as possible; each such reduction reduces the length,
so we arrive at a reduced form after a finite number of steps. In order to show
that this form is independent of the order in which the reductions are made, we
shall use the diamond lemma (Lemma 1.4.1). We have to show that if a word f is
reduced to g_1, g_2 by different elementary reductions, then there is a word h which
can be obtained by reduction from g_1 as well as g_2. There are two cases: (i) The
terms being reduced do not overlap, say u_iu_{i+1} = xx^{-1}, u_ju_{j+1} = yy^{-1}, where
j > i + 1 and x, y ∈ X ∪ X^{-1}, and of course (x^{-1})^{-1} = x. Clearly these reductions
can be performed in either order and the outcome will be the same. (ii) The terms
overlap; then a subword uu^{-1}u or u^{-1}uu^{-1} occurs. Taking uu^{-1}u, we can either
reduce uu^{-1} to 1 or u^{-1}u to 1, and we are left with u in each case, so again the
outcome is the same, and similar reasoning applies to u^{-1}uu^{-1}. Now it follows
by Lemma 1.4.1 that we have a unique reduced form. ∎
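
The reduction process used in this proof is easily carried out by machine. The Python sketch below is an illustration under our own conventions, not part of the text: a group word is represented as a list of pairs (x, e) with e = +1 or -1, and the names reduce_word and reduce_in_random_order are hypothetical. The first function cancels from left to right with a stack; the second applies elementary reductions at randomly chosen places, and by the uniqueness just proved both must reach the same reduced word.

```python
import random

def reduce_word(word):
    """Reduced form of a group word, cancelling neighbouring pairs x x^-1 and x^-1 x."""
    stack = []
    for letter, e in word:
        if stack and stack[-1] == (letter, -e):
            stack.pop()                  # elementary reduction
        else:
            stack.append((letter, e))
    return stack

def reduce_in_random_order(word, rng):
    """Apply elementary reductions at randomly chosen positions until none is left."""
    w = list(word)
    while True:
        spots = [i for i in range(len(w) - 1)
                 if w[i][0] == w[i + 1][0] and w[i][1] == -w[i + 1][1]]
        if not spots:
            return w
        i = rng.choice(spots)
        del w[i:i + 2]

# The word x y y^-1 x^-1 z x x^-1 reduces to z, whatever the order of reductions.
word = [('x', 1), ('y', 1), ('y', -1), ('x', -1), ('z', 1), ('x', 1), ('x', -1)]
print(reduce_word(word))
print(reduce_in_random_order(word, random.Random(0)))
```

The stack version runs in time linear in the length of the word, which is one practical way of solving the word problem in a free group.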


Here is another proof, not using the diamond lemma. The essential step is to show
that distinct reduced words represent distinct group elements. We define F as a
permutation group on the set W of all reduced group words as follows. Given
w = u_1 ... u_n ∈ W and x ∈ X, we put


\[ w\alpha_x = \begin{cases} u_1 \dots u_n x & \text{if } u_n \ne x^{-1} \text{ or } n = 0, \\ u_1 \dots u_{n-1} & \text{if } u_n = x^{-1}, \end{cases} \]

\[ w\beta_x = \begin{cases} u_1 \dots u_n x^{-1} & \text{if } u_n \ne x \text{ or } n = 0, \\ u_1 \dots u_{n-1} & \text{if } u_n = x. \end{cases} \]


It is easily checked that α_x is a permutation of W with inverse β_x. Thus F has been
defined as a permutation group on W. Now, given any reduced word u_1 ... u_n, if we
apply α_{u_1} ... α_{u_n} (where α_{x^{-1}} = β_x) to the empty word, we find u_1 ... u_n, hence
distinct reduced words define distinct permutations of W and so represent different
elements of F. ∎
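
The permutations used in this second proof can be written down directly. In the following Python sketch (an illustration only; the encoding of letters as pairs (x, e) with e = ±1 and the names alpha and act are our own assumptions) the function alpha realizes α_x for e = +1 and β_x for e = -1 on reduced words, and act applies the permutations belonging to the letters of a word in turn.

```python
def alpha(word, letter):
    """Image of the reduced word `word` under alpha_x (e = +1) or beta_x (e = -1)."""
    x, e = letter
    if word and word[-1] == (x, -e):
        return word[:-1]          # the last letter cancels against the new one
    return word + [letter]

def act(word, start=()):
    """Apply the permutations belonging to the letters of `word` to `start`."""
    result = list(start)
    for letter in word:
        result = alpha(result, letter)
    return result

w = [('x', 1), ('y', -1), ('x', 1)]      # a reduced word
print(act(w) == w)                        # applied to the empty word it reproduces w
```

Applied to the empty word, a reduced word reproduces itself, which is the step showing that distinct reduced words give distinct permutations.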


A free generating set in a free group will also be called a basis. Let F be the free
group with basis X = {x_1, ..., x_d} and denote by F^2 the subgroup generated by all
squares. Then F/F^2 is the elementary abelian 2-group generated by the images of
x_1, ..., x_d and hence is of order 2^d. In particular this shows that the number d of
elements in a basis is independent of the choice of basis. It is called the rank of F,
written rk F, and the above argument shows that free groups of different ranks
cannot be isomorphic (see also Further Exercise 1 of Chapter 1).

We remark that by the results of Section 1.3 every group G can be written as a 
homomorphic image of a free group; more precisely, if G can be generated by d 
elements, then it can be expressed as a quotient of a free group of rank d. 

We note that Theorem 3.4.1 solves the word problem for free groups, for it pro- 
vides a means of deciding when two group words (in a given free generating set) 
represent the same group element. Let us now look at the conjugacy problem and 
show that in free groups this can be solved in a similar manner. 

Two group words f, g are said to be cyclic conjugates if f = uv, g = vu for suitable
words u, v. Thus xy^{-1}z^{-1}xy and xyxy^{-1}z^{-1} are cyclic conjugates. We remark that
even when a word is in reduced form, it may have a cyclic conjugate which is not
reduced, e.g. x^{-1}yx. A group word is said to be cyclically reduced if all its cyclic
conjugates are reduced. Now we have


Proposition 3.4.2. Let F be a free group on a set X. Then every element of F has a
cyclically reduced conjugate and two elements of F are conjugate if and only if their
cyclically reduced forms are cyclic conjugates.




It is clear that one can check in a finite number of steps whether two reduced 
words are cyclic conjugates; for words of length n we need only compare n different 
words. 


Proof. Let f, g ∈ F; if their reduced forms are cyclic conjugates, then f = uv, g = vu
and hence g = u^{-1}fu. Conversely, suppose that f, g are conjugate. By passing to
appropriate conjugates we may suppose that both f and g are in cyclically reduced
form. Let


\[ g = u^{-1}fu, \qquad (3.4.2) \]


where u is in reduced form. Since g is cyclically reduced, the right-hand side cannot
be in reduced form, say u and f begin with the same letter x. Then f cannot end in
x^{-1} (because f is cyclically reduced), so no cancellation can take place between f and
u, and the only way to reduce the right-hand side of (3.4.2) is by cancelling an initial
portion of u against that of f. So we have either u = u_1u_2, f = u_1f_1, hence
u^{-1}fu = u_2^{-1}f_1u_1u_2; this is reduced and so must equal g; but the latter is cyclically
reduced, so u_2 = 1, u_1 = u and f = uf_1, g = f_1u, showing f, g to be cyclic conjugates.
Or we have u = fu_2; then u^{-1}fu = u_2^{-1}fu_2 and the argument shows as before that
u_2 = 1; now it follows again that f, g are cyclically conjugate. ∎
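
Proposition 3.4.2 yields a simple decision procedure for conjugacy in a free group, which may be sketched in Python as follows. The sketch is an illustration and not part of the text; words are lists of pairs (x, e) with e = ±1, and the function names are our own choice. Each word is first reduced, then cyclically reduced by stripping a first letter against an inverse last letter, and finally the rotations of one word are compared with the other.

```python
def reduce_word(word):
    """Reduced form of a group word given as a list of (letter, +1 or -1)."""
    stack = []
    for letter, e in word:
        if stack and stack[-1] == (letter, -e):
            stack.pop()
        else:
            stack.append((letter, e))
    return stack

def cyclically_reduce(word):
    """A cyclically reduced conjugate of the element represented by `word`."""
    w = reduce_word(word)
    while len(w) >= 2 and w[0] == (w[-1][0], -w[-1][1]):
        w = w[1:-1]               # conjugate by the first letter
    return w

def are_conjugate(f, g):
    """Conjugacy test: compare the cyclic conjugates of the cyclically reduced forms."""
    f, g = cyclically_reduce(f), cyclically_reduce(g)
    if len(f) != len(g):
        return False
    return any(f[i:] + f[:i] == g for i in range(max(len(f), 1)))

x, X, y, Y = ('x', 1), ('x', -1), ('y', 1), ('y', -1)
print(are_conjugate([x, y, X], [y]))      # x y x^-1 and y
print(are_conjugate([x, y], [x, Y]))      # x y and x y^-1
```

For words of length n at most n rotations are compared, in line with the remark following the proposition.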


We conclude this section by proving the Schreier subgroup theorem, first proved 
by Otto Schreier in 1927; this is an important result in its own right, while the proof 
illustrates some of the methods used in the study of free groups. 

Given any group G with generating set X and a G-set M, we define its diagram
(G, X, M) as the graph whose vertices are the points of M, with an edge from p to q
whenever q = px for some x ∈ X. More precisely, the diagram may be regarded as
a directed graph (digraph), in which each x ∈ X is represented by a directed edge
and x^{-1} by its opposite.

For example, the symmetric group Sym_3, with generating set {(1 2), (2 3)} acting on
the set {1, 2, 3} has the diagram shown below; the second diagram represents Sym_3
acting on the set of all arrangements of 1, 2, 3. In both cases continuous lines represent
(1 2) and broken lines (2 3).


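[Figure: the two diagrams of Sym_3, acting on {1, 2, 3} and on the arrangements of 1, 2, 3; of the vertex labels of the second diagram, 123, 213 and 321 survive.]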


We note that the second diagram may be regarded as the diagram of the regular
representation (G, X, G); such a diagram contains all the information contained in
the multiplication table, but in a more accessible form. For example, we can read off
from the diagram the relation [(1 2)(2 3)]^3 = 1, corresponding to a circuit
based at 123.

It is easily seen that the diagram of (G, X, M) has a connected graph iff G acts
transitively on M. When this is so, we can take M to consist of the cosets in G of
a stabilizer of a point, and from this point of view a diagram may be called a coset
diagram. Any path in such a coset diagram (G, X, G/H) represents an element of
G applied to a certain coset; in particular any element of H may be represented by
a circuit beginning and ending at the coset H. Since the graph is connected, it
contains a subgraph on all the vertices which is a tree (BA, Theorem 1.3.3), i.e. a
connected graph without circuits. Such a tree including all the vertices of our graph is
also called a spanning tree for the graph. Let us express this in terms of our group.
Given any coset Hu, there is a path in the coset diagram from H to Hu (where
the orientation in forming the path is disregarded). Each such path corresponds
to a group word w on X such that Hw = Hu. Now the choice of spanning tree in
our graph amounts to the choice of a particular representative w = w_1 ... w_r
(w_i ∈ X ∪ X^{-1}) for our coset. It has the property that any left factor w_1 ... w_i of
w is the chosen representative of its coset. A transversal with this property is
called a Schreier transversal. So the choice of a spanning tree amounts to choosing
a Schreier transversal, and it is not hard to verify that any Schreier transversal in
turn leads to a spanning tree. Therefore the existence of a spanning tree assures us
that every subgroup of G has a Schreier transversal, relative to the given generating
set of G. We note that neither X nor (G : H) need be finite for the spanning tree to
exist.

In any group G with generating set X and a subgroup H consider a coset diagram
(G, X, G/H) and choose a spanning tree Γ. Any edge α in our diagram, from p to q
say, gives rise to a circuit as follows. Denote by p_0 the vertex corresponding to the
coset H. There is a unique path w_1 within Γ from p_0 to p, and a unique path w_2
within Γ from p_0 to q; hence w_1αw_2^{-1} is a circuit on p_0. Here we have w_2 = w_1α
precisely if α ∈ Γ, so our circuit is trivial (reduced to p_0) precisely if α ∈ Γ. If α ∉ Γ,
so that the circuit is non-trivial, it will be called a basic circuit, corresponding to α. As we
saw earlier, any circuit on p_0 corresponds to an element of H. We now observe
that any circuit on p_0 can be written as a product of basic circuits. For, given
edges α_1, ..., α_r forming a circuit on p_0, suppose that α_i goes from p_{i-1} to p_i,
where p_r = p_0, and let w_i be the path within Γ from p_0 to p_i; in particular, w_0 = w_r is
the empty path.

Then we have


\[ \alpha_1 \cdots \alpha_r = w_0\alpha_1w_1^{-1} \cdot w_1\alpha_2w_2^{-1} \cdots w_{r-1}\alpha_rw_r^{-1}. \qquad (3.4.3) \]


Here w_{i-1}α_iw_i^{-1} either is trivial and so can be omitted, or it is basic. This expresses
our circuit as a product of basic circuits, and it shows that H is generated by the
elements corresponding to basic circuits. Thus we have




Proposition 3.4.3. Let G be a finitely generated group. Then any subgroup of finite 
index is finitely generated; more precisely, if G has a generating set of d elements and 
(G: H) =m, then H can be generated by m(d — 1) + 1 elements. 

It only remains to prove the last part. The coset diagram (G, X, G/H) has m
vertices and dm edges; by BA, Theorem 1.3.3, a spanning tree has m - 1 edges.
We saw that H has a generating set whose elements correspond to the non-tree
edges, and their number is dm - (m - 1) = m(d - 1) + 1. ∎
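
The construction behind Proposition 3.4.3 can be followed on a small example. The Python sketch below is an illustration only, with names of our own choosing (schreier_generators, follow): the coset diagram is assumed to be given as a dictionary assigning to each generator the permutation of the coset indices 0, ..., m - 1 it induces (vertex 0 being the coset H), the action is assumed transitive, and a breadth-first search, which disregards the orientation of the edges, produces a spanning tree together with the corresponding Schreier transversal.

```python
from collections import deque

def follow(action, p, word):
    """End point of the path starting at vertex p and labelled by `word`."""
    for x, e in word:
        p = action[x][p] if e == 1 else action[x].index(p)
    return p

def schreier_generators(action, base=0):
    """Schreier transversal (vertex -> word) and one generator of the
    stabilizer of `base` for each non-tree edge of the coset diagram."""
    m = len(next(iter(action.values())))
    rep = {base: []}              # chosen representative word of each coset
    tree = set()                  # tree edges, stored as (source vertex, generator)
    queue = deque([base])
    while queue:
        p = queue.popleft()
        for x, perm in action.items():
            q = perm[p]                       # edge x traversed forwards
            if q not in rep:
                rep[q] = rep[p] + [(x, 1)]
                tree.add((p, x))
                queue.append(q)
            q = perm.index(p)                 # edge x traversed backwards
            if q not in rep:
                rep[q] = rep[p] + [(x, -1)]
                tree.add((q, x))
                queue.append(q)
    gens = []
    for x, perm in action.items():
        for p in range(m):
            if (p, x) not in tree:            # basic circuit w_p x w_q^{-1}
                q = perm[p]
                back = [(y, -e) for (y, e) in reversed(rep[q])]
                gens.append(rep[p] + [(x, 1)] + back)
    return rep, gens

# Free group on x, y acting on 3 cosets: x acts as a 3-cycle, y as a transposition.
action = {'x': (1, 2, 0), 'y': (1, 0, 2)}
rep, gens = schreier_generators(action)
print(len(gens))                  # number of non-tree edges, m(d - 1) + 1
for g in gens:
    print(g)
```

Here m = 3 and d = 2, so the count of generators agrees with m(d - 1) + 1 = 4, and every word returned traces a circuit based at the vertex 0.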


By a slight refinement of the argument we obtain Schreier’s subgroup theorem: 


Theorem 3.4.4. Any subgroup of a free group is free. If F is free of finite rank d and H is
a subgroup of finite index m in F, then H has rank given by


\[ \operatorname{rk} H = (F : H)(\operatorname{rk} F - 1) + 1. \qquad (3.4.4) \]

Equation (3.4.4) is known as Schreier's formula (see also Section 11.5 below).


Proof. We take the coset diagram of (F, X, F/H) and choose a spanning tree Γ. As
we saw, H is generated by the elements corresponding to basic circuits; our aim is to
show that these elements generate H freely. For each non-tree edge α we have a basic
circuit ᾱ and it will be enough to show that if α_1 ... α_r is a reduced word ≠ 1 in the
non-tree edges, then


\[ \bar\alpha_1 \cdots \bar\alpha_r \ne 1, \qquad (3.4.5) \]


where ᾱ_i = w_{i-1}α_iw_i^{-1} as before. If we write out the left-hand side of (3.4.5) as a
word in X, we find that although there may be cancellation, this cannot involve
the α_i; they cannot cancel against any part of any w_j because the α_i are not in Γ,
and they cannot cancel against each other because the original word α_1 ... α_r was
reduced. Hence (3.4.5) follows and this shows the basic circuits to be a free generating
set of H. Now (3.4.4) follows as before by counting the non-tree edges. ∎


Finally we shall prove that free groups are residually finite p-groups, for any
prime p. We recall from Section 1.2 that this means: given any element c ≠ 1 in a
free group F, there exists a normal subgroup N of F not containing c, such that
F/N is a finite p-group. This will also show that for any prime p, F can be expressed as
a subdirect product of finite p-groups.

Let F be the free group on x_1, ..., x_d and consider a non-trivial word


\[ w = x_{i_1}^{a_1} x_{i_2}^{a_2} \cdots x_{i_r}^{a_r}, \quad a_\nu \ne 0, \ i_\nu \ne i_{\nu+1}. \qquad (3.4.6) \]


Let m be a positive integer so large that p^m does not divide a_1 ... a_r and take G to be
the group of all upper unitriangular matrices of order r + 1 over Z/p^m, i.e. matrices
of the form I + N, where N = (n_{ij}), n_{ij} = 0 for i ≥ j. It follows that N^{r+1} = 0, so
every such matrix I + N is invertible; since the r(r + 1)/2 entries above the main
diagonal may be chosen freely in Z/p^m, G has order p^{mr(r+1)/2}, and this shows G to
be a finite p-group. Our object will be to find a homomorphism F → G such that (3.4.6)
is not mapped to 1. We map x_i to A_i, where


\[ A_i = \prod_k (I + e_{k,k+1}), \]


where e_{k,k+1} is the matrix with (k, k+1)-entry 1 and all others zero, and the product
is taken over all k such that i_k = i. Then we have e_kA_{i_k} = e_k + e_{k+1}, where the
e_k are unit row vectors; hence


\[ e_1 A_{i_1}^{a_1} \cdots A_{i_r}^{a_r} = a_1 a_2 \cdots a_r e_{r+1} + \cdots, \]


where the final dots indicate terms in e_1, ..., e_r. Since p^m does not divide a_1 ... a_r,
the image of (3.4.6) is not 1. We have thus found a homomorphism of the required form;
we remark that G does not depend on the rank d of F, and this may be taken to be
infinite. We thus obtain


Theorem 3.4.5. The free group (of any rank) is residually a finite p-group, for any
prime p. ∎
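
The homomorphism used to prove Theorem 3.4.5 can be checked numerically for a given word. The Python sketch below is an illustration under our own conventions and not part of the text: a word is a list of pairs (i_k, a_k) with a_k ≠ 0 and i_k ≠ i_{k+1}, indices are counted from 0, the smallest admissible m is chosen, and the names mat_mul and image_of_word are hypothetical. It builds the matrices A_i over Z/p^m and multiplies out the image of the word; the top right-hand entry should be a_1 ... a_r modulo p^m.

```python
def mat_mul(A, B, mod):
    """Product of two square matrices with entries reduced modulo `mod`."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) % mod for j in range(n)]
            for i in range(n)]

def image_of_word(word, p):
    """Image of the word in the unitriangular group over Z/p^m, and the modulus p^m."""
    r = len(word)
    product = 1
    for _, a in word:
        product *= a
    m = 1
    while product % p**m == 0:    # smallest m such that p^m does not divide a_1...a_r
        m += 1
    mod = p**m

    def identity():
        return [[int(i == j) for j in range(r + 1)] for i in range(r + 1)]

    def A(i, sign):
        # A_i = I + S, where S is the sum of the e_{k,k+1} over the positions k with
        # i_k = i; these positions are never adjacent, so S*S = 0 and A_i^{-1} = I - S.
        M = identity()
        for k, (ik, _) in enumerate(word):
            if ik == i:
                M[k][k + 1] = sign % mod
        return M

    image = identity()
    for i, a in word:
        step = A(i, 1 if a > 0 else -1)
        for _ in range(abs(a)):
            image = mat_mul(image, step, mod)
    return image, mod

# The word x_0^2 x_1^-3 x_0^4 with p = 2: here a_1 a_2 a_3 = -24 and p^m = 16.
image, mod = image_of_word([(0, 2), (1, -3), (0, 4)], 2)
print(mod, image[0])
```

For this word the first row of the image ends in -24 modulo 16, which is non-zero, so the word is not mapped to the unit matrix.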


An interesting consequence is due to Wilhelm Magnus. We recall that for any
group G, the lower central series is defined recursively as γ_i(G) = (G, γ_{i-1}(G)),
γ_1(G) = G, where (G, H) denotes the subgroup generated by all commutators
(x, y) = x^{-1}y^{-1}xy (x ∈ G, y ∈ H).


Corollary 3.4.6 (W. Magnus). In a free group F, ∩_n γ_n(F) = 1.


Proof. Suppose that ∩_n γ_n(F) ≠ 1 and take a word c ≠ 1 in the intersection. By
Theorem 3.4.5 there is a normal subgroup N not containing c such that F/N is a
finite p-group (for some p). Since F/N is nilpotent, of class h say, we have
γ_{h+1}(F) ⊆ N and so c ∉ γ_{h+1}(F), which is a contradiction. ∎


For another proof of this corollary see Further Exercise 15. 


Exercises 


1. Give a direct proof that every abelian subgroup of a free group is cyclic.

2. In a free group, if w = u^n, where n ≥ 1, u is called a root of w, primitive if n is
maximal for a given w. Show that every element of a free group has a unique
primitive root. Show that two elements u, v ≠ 1 have a common root or inverse
roots iff they commute.

3. Let F be a free group. Show that every subgroup of finite index meets every
subgroup ≠ 1 non-trivially.

4. Define a group G to be projective (for this exercise only) if any homomorphism
G → H_1, where H_1 is a quotient of a group H, can be lifted to a homomorphism
G → H. Show that a group is projective iff it is free. (Note that this allows one to
define free groups without reference to a basis.)

5. Let G be any group and X a subset such that X ∩ X^{-1} = ∅ and any non-empty
product of elements of X ∪ X^{-1} with no factors xx^{-1} or x^{-1}x is ≠ 1. Show that
X is a free generating set of the subgroup generated by it.

6. Show that a free group of rank d cannot be generated by fewer than d elements.

7. A group G is called Hopfian (after Heinz Hopf) if every surjective endomorphism
of G is an automorphism. Show that every free group of finite rank is Hopfian.

8. Show that in a free group of rank 3 any subgroup of finite index has odd rank.


9. In the free group on x, y, z find a basis for the subgroup generated by all the 
squares. 

10. Show that in the free group on x, y, the elements y^{-n}xy^n form a free generating
set. Deduce that any non-cyclic free group contains a free subgroup of countable
rank.

11. Show that in any free group F of rank > 1, the derived group F' has countable
rank.

12. Let G be a group with generating set X and H a subgroup. Show how to construct
a Schreier transversal and verify that it corresponds to a spanning tree
of the graph (G, X, G/H).


3.5 Linear groups 


In Section 3.1 we saw how in principle at least every finite group can be built up from 
simple groups. This leaves the problem of determining the finite simple groups, and 
it was only around 1980 that the classification of the simple finite groups was com- 
pleted. The full list contains 18 families (including the cyclic groups of prime order) 
and another 26 ‘sporadic’ groups. This is not the place to give a detailed account (see 
Gorenstein (1982)), but we shall present some of the families that have been known 
since the work of Leonard Eugene Dickson (1901). It is well known that the alternat- 
ing group Alt, is simple for n > 5. In this section we shall describe the linear groups 
and in the next two sections we deal with the symplectic and orthogonal groups, 
collectively usually known under the name of classical groups since the time of 
Hermann Weyl (1939). These families of groups can be shown to correspond to
certain infinite families of simple Lie algebras, though we shall not do so here. It 
was Chevalley’s paper [1955], describing how these families of Lie algebras could 
also be defined over finite fields, that provided the impetus for research in the 
1960s and 1970s that led to the classification of the simple finite groups. 

Let k be any field and V an n-dimensional vector space over k. The automorphisms
of V form a group whose members may be described, after the choice of a basis in V,
by invertible n × n matrices over k. This is the general linear group of degree n over k,
written GL(V) or GL_n(k). The determinant function provides a homomorphism
from GL_n(k) to k^× whose kernel SL_n(k) is the special linear group, consisting of all
n × n matrices over k with determinant 1. We remark that the general linear
group may be defined more generally over any ring R as GL_n(R), the set of all
invertible n × n matrices over R. When R is commutative we also find the subgroup
SL_n(R) as before, but this definition breaks down for more general rings. Later, in
Chapter 9, we shall find a way of defining SL_n(K) for a skew field K, but for the
present we shall mainly be concerned with a commutative field k as ground ring.

We begin by finding generating sets for GL_n(k) and SL_n(k). We denote by E_{ij} the
matrix with (i, j)-entry 1 and all others zero. By an elementary matrix we understand
a matrix of the form


\[ B_{ij}(a) = I + aE_{ij}, \quad \text{where } a \in k, \ i \ne j. \]


It is clear that this matrix lies in SL_n(k), with inverse B_{ij}(-a). A diagonal matrix
Σ a_iE_{ii} lies in GL_n(k) iff ∏ a_i ≠ 0, while ∏ a_i = 1 is the condition for it to belong
to SL_n(k).

Taking V again as our vector space over k, let u ≠ 0 be a vector in V. By a transvection
along u we understand a linear mapping T of V keeping u fixed and such that
xT - x is in the direction of u. Thus we can express T as


\[ xT = x + \lambda(x)u, \]


where λ is a linear functional on V such that λ(u) = 0. For example, with the standard
basis e_1, ..., e_n in k^n, B_{21}(a) maps x = Σ x_ie_i to x + ax_2e_1 and so is a transvection
along e_1 (for n > 1). Given a transvection T ≠ I along u, let us choose a basis
u_1, ..., u_n of V such that u_1 = u while u_1, ..., u_{n-1} form a basis for the kernel of
T - I. Then T takes the form xT = x + λ(x)u_1, so T is represented by an elementary
matrix relative to this basis.

For our first result we take our ring to be a Euclidean domain; we recall (from BA, 
Section 10.2) that a Euclidean domain is an integral domain in which the Euclidean 
algorithm holds (relative to a norm function). 


Theorem 3.5.1. For any (commutative) Euclidean domain R and any n ≥ 1, the
special linear group SL_n(R) is generated by all elementary matrices and the general
linear group GL_n(R) by all elementary and diagonal matrices with units along the
main diagonal. In particular, this holds for any field.


Proof. The result is clear for $n = 1$, so assume $n > 1$. Let $A \in SL_n(R)$; it is clear that
right multiplication by $B_{ij}(c)$ corresponds to the operation of adding the $i$-th
column, multiplied by $c$, to the $j$-th column. Now the Euclidean algorithm shows
that multiplication on the right alternately by $B_{12}(p)$ and $B_{21}(q)$ for suitable
$p, q \in R$ reduces $a_{11}, a_{12}$ to $d, 0$ respectively, where $d$ is an HCF of $a_{11}$ and $a_{12}$. By
operating similarly on other columns we reduce all elements of the first row after
the first one to zero. We next repeat this process on the rows, to reduce all elements
in the first column after the first one to zero, by multiplying by suitable elementary
matrices on the left. This either leaves the top row unchanged or it replaces the
$(1, 1)$-entry by a proper factor. By an induction on the number of factors we
reduce $A$ to the form $a \oplus A_1$, where $A_1$ is $(n - 1) \times (n - 1)$, and another induction,
this time on $n$, reduces $A$ to diagonal form. Using the factorization


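\[ \begin{pmatrix} a & 0 \\ 0 & a^{-1} \end{pmatrix} = B_{12}(a)\,B_{21}(-a^{-1})\,B_{12}(a)\,B_{12}(-1)\,B_{21}(1)\,B_{12}(-1) \quad (a \text{ a unit of } R), \]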


we can express $\operatorname{diag}(a_1, a_2, \ldots, a_n)$ by $\operatorname{diag}(1, a_1a_2, a_3, \ldots, a_n)$ and a product of
elementary matrices, and hence by another induction express $A$ as a product of
elementary matrices.

The same process applied to a matrix of $GL_n(R)$ reduces it to a diagonal matrix
and this proves the second assertion. ∎
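The first step of the proof, reducing a pair of entries $(a, b)$ of the top row to $(d, 0)$ by elementary column operations, can be sketched for $R = \mathbf{Z}$ as follows. This is only an illustration under that assumption; the function names (`b12`, `b21`, `reduce_pair`, `product_of`) are hypothetical, and $d$ is an HCF only up to sign.

```python
def mat2_mul(x, y):
    """Product of two 2 x 2 integer matrices."""
    return [[x[0][0] * y[0][0] + x[0][1] * y[1][0], x[0][0] * y[0][1] + x[0][1] * y[1][1]],
            [x[1][0] * y[0][0] + x[1][1] * y[1][0], x[1][0] * y[0][1] + x[1][1] * y[1][1]]]


def b12(c):
    """Right multiplication adds c * (column 1) to column 2."""
    return [[1, c], [0, 1]]


def b21(c):
    """Right multiplication adds c * (column 2) to column 1."""
    return [[1, 0], [c, 1]]


def reduce_pair(a, b):
    """Return (d, factors) such that (a, b) * product(factors) == (d, 0)."""
    factors = []
    while b != 0:
        if a == 0:
            step, a = b21(1), b            # (0, b) -> (b, b)
        elif abs(a) <= abs(b):
            c = -(b // a)
            step, b = b12(c), b + c * a    # replace b by its remainder mod a
        else:
            c = -(a // b)
            step, a = b21(c), a + c * b    # replace a by its remainder mod b
        factors.append(step)
    return a, factors


def product_of(factors):
    """Multiply the elementary factors in the order they were applied."""
    result = [[1, 0], [0, 1]]
    for f in factors:
        result = mat2_mul(result, f)
    return result


d, factors = reduce_pair(12, 18)
p_mat = product_of(factors)
row = (12 * p_mat[0][0] + 18 * p_mat[1][0], 12 * p_mat[0][1] + 18 * p_mat[1][1])
det = p_mat[0][0] * p_mat[1][1] - p_mat[0][1] * p_mat[1][0]
```

Each factor is an elementary matrix, so the accumulated product has determinant 1, in line with the statement that only elementary matrices are needed inside $SL_n(R)$.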




For any commutative ring $R$ the determinant may be regarded as a homomorphism
from $GL_n(R)$ to the group of units $U(R)$ and it yields the exact sequence

\[ 1 \to SL_n(R) \to GL_n(R) \xrightarrow{\ \det\ } U(R) \to 1. \tag{3.5.1} \]


Let us recall that a group $G$ is called perfect if it coincides with its derived group $G'$.
Since $R$ is commutative, it follows that $GL_n(R)' \subseteq SL_n(R)$; for fields we have the
following more precise relation:


Proposition 3.5.2. Let $k$ be any field and $n \geq 2$. Then

\[ SL_n(k)' = GL_n(k)' = SL_n(k), \tag{3.5.2} \]

except when $n = 2$ and $k$ consists of 2 or 3 elements. Thus $SL_n(k)$ is perfect (with the
exceptions listed).


Proof. As we have remarked, we have $SL_n(k)' \subseteq GL_n(k)' \subseteq SL_n(k)$. To establish
equality it is enough, by Theorem 3.5.1, to show that every elementary matrix is a
product of commutators. It is easily checked that for any distinct indices $i, j, k$,

\[ (B_{ij}(a), B_{jk}(1)) = B_{ij}(-a)B_{jk}(-1)B_{ij}(a)B_{jk}(1) = B_{ik}(a). \]

This expresses every elementary matrix as a commutator when $n \geq 3$. For $n = 2$ we
have

\[ \left( \begin{pmatrix} b^{-1} & 0 \\ 0 & b \end{pmatrix}, \begin{pmatrix} 1 & c \\ 0 & 1 \end{pmatrix} \right) = \begin{pmatrix} 1 & (1 - b^2)c \\ 0 & 1 \end{pmatrix}; \tag{3.5.3} \]

hence if $k$ contains an element $b$ such that $b \neq 0$, $b^2 \neq 1$, then we can express any
elementary matrix $B_{12}(a)$ as a commutator by taking $c = (1 - b^2)^{-1}a$ in (3.5.3),
and similarly for $B_{21}(a)$. This shows that $SL_n(k)' = SL_n(k)$ except when $n = 2$ and
$b^3 = b$ for all $b \in k$, and this happens only when $|k| = 2$ or 3. ∎
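Both commutator identities used in the proof can be checked by direct matrix multiplication. The sketch below uses the convention $(x, y) = x^{-1}y^{-1}xy$; the helper names and the sample values ($a = 4$ over the integers, and $p = 5$, $b = 2$, $a = 3$ for the $2 \times 2$ case) are illustrative assumptions. The call `pow(v, -1, p)` for a modular inverse requires Python 3.8 or later.

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def elementary(n, i, j, a):
    """B_ij(a) = I + a*E_ij with 1-based indices i != j."""
    m = identity(n)
    m[i - 1][j - 1] = a
    return m


def matmul(x, y, mod=None):
    """Matrix product, reduced modulo `mod` when a modulus is given."""
    out = [[sum(x[i][t] * y[t][j] for t in range(len(y)))
            for j in range(len(y[0]))] for i in range(len(x))]
    if mod is not None:
        out = [[v % mod for v in row] for row in out]
    return out


def commutator(x, x_inv, y, y_inv, mod=None):
    """(x, y) = x^{-1} y^{-1} x y, with the inverses supplied by the caller."""
    return matmul(matmul(x_inv, y_inv, mod), matmul(x, y, mod), mod)


# n = 3 over the integers: (B_12(a), B_23(1)) = B_13(a).
a = 4
lhs3 = commutator(elementary(3, 1, 2, a), elementary(3, 1, 2, -a),
                  elementary(3, 2, 3, 1), elementary(3, 2, 3, -1))
rhs3 = elementary(3, 1, 3, a)

# n = 2 over F_5: choose b with b != 0, b^2 != 1 and c = (1 - b^2)^{-1} * a.
p, b, target = 5, 2, 3
b_inv = pow(b, -1, p)
c = (pow((1 - b * b) % p, -1, p) * target) % p
x, x_inv = [[b_inv, 0], [0, b]], [[b, 0], [0, b_inv]]
y, y_inv = [[1, c], [0, 1]], [[1, (-c) % p], [0, 1]]
lhs2 = commutator(x, x_inv, y, y_inv, p)
```

With these values `lhs2` is the elementary matrix $B_{12}(3)$ over $\mathbf{F}_5$, as (3.5.3) predicts.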


We shall see that $SL_2(\mathbf{F}_2)$ and $SL_2(\mathbf{F}_3)$ are soluble (see Exercise 1), so the exceptions
made in Proposition 3.5.2 do in fact occur.

Our next aim is to show that $SL_n(k)$ modulo its centre is simple (except when
$n = 2$ and $|k| \leq 3$). By taking $k$ to be finite we thus obtain a family of finite
simple groups. The centre of $GL_n(k)$ or $SL_n(k)$ clearly consists of all scalar matrices
and the quotients are known as the projective linear groups, written

\[ PGL_n(k) = GL_n(k)/Z, \qquad PSL_n(k) = SL_n(k)/(Z \cap SL_n(k)), \]

where $Z$ is the centre of $GL_n(k)$. Let us define projective $n$-space $P^n(k)$ as the set of all
$(n + 1)$-tuples $x = (x_0, x_1, \ldots, x_n) \in k^{n+1}$, where the $x_i$ are not all 0 and $x = y$ iff
$x_i = \lambda y_i$ for some $\lambda \in k^\times$. Thus the points of $P^n(k)$ are given by the ratios of
$n + 1$ coordinates. It is clear that $PGL_{n+1}(k)$ and $PSL_{n+1}(k)$ act in a natural way
on the points of $P^n(k)$.

To prove the simplicity of $PSL_n$ (following Iwasawa) we shall need to recall the
notion of primitivity and derive some of its properties. We recall that a permutation
group $G$ acting on a set $S$ is transitive if for any $p, q \in S$ there exists $g \in G$
such that $pg = q$; thus $S$ consists of a single $G$-orbit. The stabilizer of $p \in S$ is
$St_p = \{x \in G \mid px = p\}$, a subgroup of $G$. When $G$ acts on $S$, it acts in a natural way
on $S^n$ for $n \geq 1$; if this action is transitive on the $n$-tuples of distinct elements, $G$ is
said to be $n$-fold transitive. The $G$-action on $S$ is called primitive if $G$ acts transitively
and there is no partition of $S$ (other than the trivial ones consisting of $S$ alone and of
the 1-element subsets) which is compatible with the $G$-action. If $T$ is another $G$-set, a map
$f : S \to T$ is said to be compatible with $G$ if $(pg)f = (pf)g$ ($p \in S$, $g \in G$); then
it is easily seen that the action on $S$ is primitive iff there is no compatible map
from $S$ onto a $G$-set with more than one element, which is not injective. It is also
not hard to show that $G$ is imprimitive precisely when $S$ contains a proper subset
$S_0$ with more than one element such that for each $g \in G$ either $S_0g = S_0$ or
$S_0g \cap S_0 = \emptyset$. For example, any 2-fold transitive group is primitive, for if
$p, q \in S_0$ are distinct, then there exists $g \in G$ with $pg \in S_0$, $qg \notin S_0$, hence $S_0g$ meets $S_0$ but is
not equal to it.

It follows that for $n \geq 2$ the action of $PSL_n(k)$ on $P^{n-1}(k)$ is primitive, since it is
doubly transitive. To establish this result, take two pairs of distinct points
$(p_1, p_2)$ and $(q_1, q_2)$ in $P^{n-1}(k)$, with coordinate vectors $x_i, y_i$ for $p_i, q_i$ respectively, so that $x_1, x_2$
are linearly independent, as well as $y_1, y_2$; then there exists $A \in SL_n(k)$ such that
$x_1A = y_1$, $x_2A = cy_2$ for some $c \in k^\times$, hence $A$ maps $p_i$ to $q_i$ for $i = 1, 2$. By the earlier
remark it follows that $PSL_n(k)$ is primitive.
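For a small field the double transitivity can be confirmed by brute force. The sketch below enumerates $SL_2(\mathbf{F}_3)$ and lets it act on the four points of $P^1(\mathbf{F}_3)$ by right multiplication on row vectors; the choice $p = 3$, $n = 2$ and all function names are assumptions made for illustration (`pow(x, -1, p)` needs Python 3.8+).

```python
from itertools import permutations, product

p = 3  # the field F_3, chosen only to keep the enumeration tiny


def normalize(v):
    """Scale a non-zero vector over F_p so that its first non-zero entry is 1."""
    lead = next(c for c in v if c % p != 0)
    inv = pow(lead, -1, p)
    return tuple((c * inv) % p for c in v)


# Points of the projective line: non-zero vectors up to scalar multiples.
points = sorted({normalize(v) for v in product(range(p), repeat=2) if any(v)})

# All matrices (a b; c d) over F_p with determinant 1.
sl2 = [(a, b, c, d) for a, b, c, d in product(range(p), repeat=4)
       if (a * d - b * c) % p == 1]


def act(point, g):
    """Image of a projective point under g, using row vectors x -> xg."""
    x, y = point
    a, b, c, d = g
    return normalize(((x * a + y * c) % p, (x * b + y * d) % p))


def doubly_transitive():
    """True if the orbit of one ordered pair of distinct points is everything."""
    pairs = list(permutations(points, 2))
    start = pairs[0]
    orbit = {(act(start[0], g), act(start[1], g)) for g in sl2}
    return orbit == set(pairs)
```

Since the scalar matrices act trivially on points, the same orbit computation describes the action of $PSL_2(\mathbf{F}_3)$.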

We shall need a couple of lemmas giving conditions for primitivity: 


Lemma 3.5.3. Let $G$ be a permutation group acting on a set $S$ and denote the stabilizer
of $p \in S$ by $St_p$. If $G$ is transitive, then it is primitive if and only if $St_p$ is a maximal
proper subgroup of $G$.


Proof. Assume $G$ to be imprimitive and let $f : S \to T$ be a compatible mapping onto
another $G$-set which is neither injective nor a constant mapping. Take $p, q \in S$
such that $p \neq q$, $pf = qf$ and denote the stabilizer of $pf$ by $L$. If $pg = p$, then
$(pf)g = (pg)f = pf$, hence $St_p \subseteq L \subseteq G$. By transitivity there exists $h \in G$ such
that $ph = q$, hence $(pf)h = (ph)f = qf = pf$. This shows that $h \in L$, but $h \notin St_p$,
so $St_p \subset L$ and $L$ is a proper subgroup of $G$, because $f$ is non-constant. Thus $St_p$ is
not maximal.

Conversely, if $St_p \subset K \subset G$ for some subgroup $K$ of $G$, we may regard $S$ as the
coset space $\bigcup St_p x$ and the mapping $St_p x \mapsto Kx$ is compatible and neither constant
nor injective, hence $G$ cannot be primitive. ∎


Lemma 3.5.4. Let $G$ be a primitive permutation group acting on a set $S$. Then any non-trivial
normal subgroup $N$ acts transitively and $G = St_p.N$ for the stabilizer $St_p$ of any
point $p$ of $S$.


Proof. Consider the orbits under the action of $N$. For any $p \in S$, $g \in G$, we have
$pNg = pgN$, and this equals $pN$ if $pg \in pN$ and otherwise is disjoint from it. Thus the
$N$-orbits form a partition of $S$ compatible with the $G$-action, and by primitivity
it follows that $pN = S$. Now fix $p \in S$ and take any $g \in G$; then $pg = pu$ for some
$u \in N$, hence $gu^{-1} \in St_p$ and so $g \in St_p.N$ as claimed. ∎


This leads to a criterion for simplicity: 




Proposition 3.5.5. Let $G$ be a primitive permutation group acting on $S$. If for some
$p \in S$ the stabilizer $St_p$ contains an abelian subgroup $A$ normal in $St_p$ such that $G$ is
generated by all the conjugates $g^{-1}Ag$ ($g \in G$), then any non-trivial normal subgroup
of $G$ contains the derived group $G'$. In particular, if $G$ is perfect, then it must be simple.


Proof. Let $N \neq 1$ be a normal subgroup of $G$. By Lemma 3.5.4, $N$ is transitive
and $G = St_p.N$, where $p \in S$. We claim that $AN$ is normal in $G$. Let $a \in A$, $n \in N$;
any $g \in G$ has the form $bm$, where $b \in St_p$, $m \in N$, and
$(bm)^{-1}an\,bm = m^{-1}b^{-1}anbm = m^{-1}a_1b^{-1}nbm = m^{-1}a_1n_1m = a_1m_1n_1m$ for some $a_1 \in A$, $m_1$,
$n_1 \in N$. Hence $AN$ is normal in $G$ and so contains all the conjugates $g^{-1}Ag$.
It follows that $AN = G$; now $G/N \cong A/(A \cap N)$ is abelian, therefore $N \supseteq G'$ and the
conclusion follows. ∎


We shall use these results to prove that $PSL_n(k)$ is simple, taking the action of
$PSL_n(k)$ on $P^{n-1}(k)$.


Theorem 3.5.6. Let $k$ be a field and $n \geq 2$. Then $PSL_n(k)$ is simple except when $n = 2$
and $|k| \leq 3$.


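Proof. As we have seen, the action of $G = PSL_n(k)$ on $P^{n-1}(k)$ is primitive; moreover,
$G$ is perfect, for if $H = SL_n(k)$ and $Z$ denotes the centre of $H$, then
$G = H/Z$, hence by Proposition 3.5.2, $G' = H'Z/Z = H/Z = G$. To complete the
proof we shall verify the conditions of Proposition 3.5.5. Let $p$ be the point with
coordinates $(1, 0, \ldots, 0)$. This is left fixed by any matrix

\[ \begin{pmatrix} a & 0 \\ b & A \end{pmatrix} \quad \text{with inverse} \quad \begin{pmatrix} a^{-1} & 0 \\ -A^{-1}ba^{-1} & A^{-1} \end{pmatrix}, \quad \text{where } a \in k^\times,\ b \in {}^{n-1}k,\ A \in GL_{n-1}(k), \tag{3.5.4} \]

where ${}^{n-1}k$ denotes the space of columns of length $n - 1$ over $k$. Consider the subgroup $L$ consisting of all matrices

\[ \begin{pmatrix} 1 & 0 \\ c & I \end{pmatrix}, \quad \text{where } c \in {}^{n-1}k. \tag{3.5.5} \]

Clearly this is an abelian subgroup, and transforming by (3.5.4), we find

\[ \begin{pmatrix} a^{-1} & 0 \\ -A^{-1}ba^{-1} & A^{-1} \end{pmatrix} \begin{pmatrix} 1 & 0 \\ c & I \end{pmatrix} \begin{pmatrix} a & 0 \\ b & A \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ A^{-1}ca & I \end{pmatrix}. \]

Thus $L$ is a normal abelian subgroup of $St_p$. Clearly every elementary matrix is
conjugate to a matrix in $L$, so by Theorem 3.5.1, $L$ and its conjugates generate $G$.

Thus $G$ satisfies the conditions of Proposition 3.5.5, and it is perfect, therefore it
is simple. ∎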


Let us determine the order of $PSL_n(k)$ when $k$ is finite, with $q$ elements say. We
begin with $GL_n(k)$; this group acts on $V = k^n$, a vector space with $q^n$ elements.
Any element of $GL_n(k)$ is completely determined by the image of the standard
basis $e_1, e_2, \ldots, e_n$ of $V$, and this image can be any basis of $V$. Thus $e_1$ can map to
any non-zero vector, giving $q^n - 1$ choices, $e_2$ can map to any vector independent of
the image of $e_1$, giving $q^n - q$ choices, $e_3$ can map to any vector not a linear combination
of the images of $e_1, e_2$, giving $q^n - q^2$ choices, and so on. In this way we find
that

\[ |GL_n(k)| = (q^n - 1)(q^n - q) \cdots (q^n - q^{n-1}), \quad \text{where } |k| = q. \]

To determine $|SL_n(k)|$ we use the exact sequence (3.5.1) and find

\[ |SL_n(k)| = (q^n - 1)(q^n - q) \cdots (q^n - q^{n-2})\,q^{n-1}. \]

To find the order of $PSL_n(k)$ we need to calculate the order of the centre $Z$ of $SL_n(k)$. We recall
that $Z$ consists of all matrices $cI$ such that $c^n = 1$, so we need to find the number of
solutions of $c^n = 1$ in $k$. But $k^\times$ is a cyclic group of order $q - 1$, so the number we
want is $d = (n, q - 1)$. Thus

\[ |PSL_n(k)| = (q^n - 1)(q^n - q) \cdots (q^n - q^{n-2})\,q^{n-1}/d, \quad \text{where } d = (n, q - 1). \]
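The three order formulae are easy to evaluate. The following sketch assumes that `q` is a prime power (this is not checked) and uses hypothetical function names; it simply transcribes the formulae above.

```python
from math import gcd


def order_gl(n, q):
    """|GL_n(F_q)| = (q^n - 1)(q^n - q)...(q^n - q^(n-1))."""
    result = 1
    for i in range(n):
        result *= q**n - q**i
    return result


def order_sl(n, q):
    """|SL_n(F_q)|: the determinant maps GL_n onto the q - 1 units."""
    return order_gl(n, q) // (q - 1)


def order_psl(n, q):
    """|PSL_n(F_q)|: divide by the number d = (n, q - 1) of scalars in SL_n."""
    return order_sl(n, q) // gcd(n, q - 1)
```

For instance, `order_psl(3, 4)` and `order_psl(4, 2)` both evaluate to 20160, the coincidence of orders that Exercise 6 below asks about.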


Exercises 


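1. By examining the action of $PSL_2(k)$ on $P^1(k)$ show that $PSL_2(\mathbf{F}_2) \cong \mathrm{Sym}_3$,
$PSL_2(\mathbf{F}_3) \cong \mathrm{Alt}_4$. Show also that $GL_2(\mathbf{F}_2) \cong \mathrm{Sym}_3$.

2. Show that $PSL_2(\mathbf{F}_4) \cong PSL_2(\mathbf{F}_5) \cong \mathrm{Alt}_5$.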

3. Show that $PSL_2(\mathbf{F}_9) \cong \mathrm{Alt}_6$, $PSL_4(\mathbf{F}_2) \cong \mathrm{Alt}_8$.

4. Apply Proposition 3.5.5 to show that $\mathrm{Alt}_5$ is simple.

5. How much remains true of Theorem 3.5.1 when commutativity is dropped?

6. Show that $PSL_3(\mathbf{F}_4)$ and $PSL_4(\mathbf{F}_2)$ have the same order but are not isomorphic.
(Hint. Compare the Sylow 2-subgroups.)


3.6 The symplectic group 


Another class of simple groups is formed by the symplectic groups. We recall that a
symplectic space is a vector space $V$ over a field $k$ with a regular bilinear form $b$ which
is alternating, i.e. $b(x, x) = 0$, and hence $b(x, y) = -b(y, x)$. Thus every vector is isotropic.
Relative to a basis this form is described by an alternating matrix (also called
skew-symmetric): $A^T = -A$. Since the form is regular, its matrix $A$ is non-singular,
and it follows that a symplectic space is always even-dimensional. The symplectic
group of $V$, $Sp(V)$ or $Sp_{2m}(k)$, where $2m = \dim V$, is the group of all symplectic
transformations, i.e. all isometries of $V$. Any symplectic space $V$ of dimension $2m$
has a basis of the form $u_1, \ldots, u_m, v_1, \ldots, v_m$ where $b(u_i, v_j) = \delta_{ij}$, $b(u_i, u_j) =
b(v_i, v_j) = 0$. A basis of this form is called a symplectic basis. Relative to this basis the
matrix of the form $b$ becomes

\[ J = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}. \tag{3.6.1} \]

Now the symplectic transformations may be described by the matrices $P$ such that

\[ PJP^T = J. \tag{3.6.2} \]


To establish the existence of a symplectic basis we recall a result from BA. By a hyperbolic
pair of vectors we understand a pair $u, v \in V$ such that $b(u, v) = 1$; clearly a
two-dimensional symplectic space has a basis consisting of a hyperbolic pair; such
a space is called a hyperbolic plane. We recall that a subspace $N$ is called totally isotropic
if the form restricted to $N$ is identically zero.


Lemma 3.6.1. Let $V$ be a symplectic space, $U$ any subspace and $U_0$ a maximal totally
isotropic subspace of $U$. Then $\dim U \leq 2\dim U_0$ and any basis $u_1, \ldots, u_r$ of $U_0$ can
after renumbering form part of a basis $u_1, \ldots, u_r, v_1, \ldots, v_s$ of $U$ such that the
$(u_i, v_i)$ ($i = 1, \ldots, s$; $s \leq r$) are mutually orthogonal hyperbolic pairs. Moreover, this
basis of $U$ can be completed to a symplectic basis of $V$.


Proof. Given $U$ and $U_0$ as stated, if $U_0 \neq U$, then no vector of $U$ outside $U_0$ is orthogonal to
all of $U_0$, so there is $v_1 \in U$ such that $b(u_i, v_1) = 0$ for all $i$ except one, which may be
taken as 1 by renumbering the $u$'s; thus $b(u_1, v_1) \neq 0$ and replacing $v_1$ by $v_1/b(u_1, v_1)$
we have $b(u_1, v_1) = 1$. If $\langle U_0, v_1 \rangle \neq U$, we can repeat the process and after a finite
number of steps we reach a basis of $U$ of the required form. By the maximality of
$U_0$ we have $s \leq r$, hence $\dim U = r + s \leq 2r$.

Now if $s < r$, we can in $V$ find $v_{s+1}$ such that $b(u_i, v_{s+1}) = \delta_{i,s+1}$ (because $V$ is regular);
continuing in this way we find $u_1, \ldots, u_r, v_1, \ldots, v_r$, a symplectic basis for a subspace
$W$ of $V$. If $W \neq V$, we have an orthogonal sum $V = W \perp W^\perp$; by induction
on $\dim V$ we can find a symplectic basis for $W^\perp$ which together with the basis found
for $W$ forms a symplectic basis for $V$. ∎


For any $u \neq 0$ in $V$ and any $c \in k$ the linear mapping defined by

\[ \tau_{u,c} : x \mapsto x + cb(x, u)u \tag{3.6.3} \]

is a transvection along $u$, which is easily verified to be symplectic; its properties are
given by


Lemma 3.6.2. Let $V$ be a symplectic space of dimension $2m$ and $u$ any non-zero vector
in $V$. Then any transvection which is symplectic, with kernel $u^\perp$, has the form $\tau_{u,c}$ for
some $c \in k$. Moreover,

(i) for fixed $u$ the mapping $c \mapsto \tau_{u,c}$ is an injective homomorphism of the additive
group of $k$ into $Sp_{2m}(k)$,

(ii) for any $\sigma \in Sp_{2m}(k)$, $\sigma^{-1}\tau_{u,c}\sigma = \tau_{u\sigma,c}$, and

(iii) for any $a \in k^\times$, $\tau_{au,c} = \tau_{u,a^2c}$.

Proof. Any linear functional on $V$ with kernel $u^\perp$ is a multiple of $b(-, u)$, hence the
symplectic transvection with kernel $u^\perp$ has the form $\tau_{u,c}$ for some $c \in k$. The other
properties are verified without difficulty. ∎
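The properties of symplectic transvections can be tested numerically on a small example. The sketch below works in $V = \mathbf{F}_5^4$ with the alternating form given by the matrix $J$ of (3.6.1); the field, the dimension, the vector `u` and all function names are arbitrary choices made for illustration. The last check assumes the rescaling rule $\tau_{au,c} = \tau_{u,a^2c}$, which follows directly from (3.6.3).

```python
from itertools import product

P = 5  # work over the prime field F_5 (an arbitrary small choice)
M = 2  # half the dimension, so V = F_5^4


def form(x, y):
    """b(x, y) = sum_i (x_i*y_{m+i} - x_{m+i}*y_i) mod P, the form with matrix J."""
    return sum(x[i] * y[M + i] - x[M + i] * y[i] for i in range(M)) % P


def transvection(u, c):
    """Return the map tau_{u,c}: x -> x + c*b(x, u)*u."""
    def tau(x):
        s = (c * form(x, u)) % P
        return tuple((xi + s * ui) % P for xi, ui in zip(x, u))
    return tau


vectors = list(product(range(P), repeat=2 * M))
basis = [tuple(int(i == j) for j in range(2 * M)) for i in range(2 * M)]
u = (1, 2, 0, 3)

tau_2 = transvection(u, 2)
# tau is linear, so checking the form on a basis is enough.
preserves_form = all(form(tau_2(x), tau_2(y)) == form(x, y)
                     for x in basis for y in basis)
# Additivity in c: tau_{u,2} followed by tau_{u,4} is tau_{u,6} = tau_{u,1} over F_5.
additive = all(transvection(u, 4)(tau_2(x)) == transvection(u, 1)(x)
               for x in vectors)
# Rescaling u by a = 3 multiplies the parameter by a^2 = 9.
scaled_u = tuple((3 * ui) % P for ui in u)
rescaling = all(transvection(scaled_u, 2)(x) == transvection(u, (9 * 2) % P)(x)
                for x in vectors)
```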


We shall use transvections to find a generating set for $Sp_{2m}(k)$:


Theorem 3.6.3. For any field $k$, and any $m \geq 1$, $Sp_{2m}(k)$ is transitive on the hyperbolic
pairs, and is generated by the set of all symplectic transvections.


Proof. We denote by $T$ the subgroup generated by all symplectic transvections and
divide the proof into three parts:

(i) $T$ is transitive on the non-zero vectors of $V$. For if $u_1, u_2 \neq 0$ and $b(u_1, u_2) \neq 0$,
let $c \in k$ be such that $cb(u_1, u_2) = 1$ and put $u = u_1 - u_2$. Then $\tau_{u,c}$ maps $u_1$ to
$u_1 + cb(u_1, u_1 - u_2)(u_1 - u_2) = u_1 - (u_1 - u_2) = u_2$. If $b(u_1, u_2) = 0$ and
$v \in V$ is such that $b(u_i, v) \neq 0$ ($i = 1, 2$), then the result just proved yields
transvections to map $u_1$ to $v$ and $v$ to $u_2$, so it only remains to find $v$. If
$u_1, u_2$ are linearly dependent, we can take any $v$ not in $u_1^\perp$. Otherwise $u_1, u_2$
are linearly independent and such that $b(u_1, u_2) = 0$; then we can by Lemma
3.6.1 find $v_1, v_2$ such that $(u_1, v_1)$ and $(u_2, v_2)$ are orthogonal hyperbolic
pairs, and now $v = v_1 + v_2$ is the required vector. Thus $T$ has been shown to
be transitive on the non-zero vectors of $V$.

(ii) Next we show that $T$ is transitive on the hyperbolic pairs in $V$. Let $(u_i, v_i)$
($i = 1, 2$) be two such pairs. By what has been proved we may take
$u_1 = u_2 = u$. If $b(v_1, v_2) \neq 0$, then (as we saw in (i)) there is a symplectic transvection
$\tau$ along $v_1 - v_2$ with $v_1\tau = v_2$; since $b(u, v_1 - v_2) = b(u, v_1) -
b(u, v_2) = 0$, we have $u\tau = u$, so $\tau$ maps $(u, v_1)$ to $(u, v_2)$. If $b(v_1, v_2) = 0$, we use
the hyperbolic pair $(u, v_1 + u)$; since $-b(v_1, v_1 + u) = b(v_1 + u, v_2) = 1$, we can
find symplectic transvections to map $(u, v_1)$ to $(u, v_1 + u)$ and $(u, v_1 + u)$ to
$(u, v_2)$.

(iii) We now use induction on $m$ to show that $T = Sp_{2m}(k)$. Suppose that $m = 1$,
and that $u, v$ is a symplectic basis. Any linear transformation has the form

\[ u \mapsto u' = au + bv, \qquad v \mapsto v' = cu + dv, \]

and this will be symplectic iff $b(u', v') = 1$, i.e. $ad - bc = 1$. Thus $Sp_2(k)$ consists
precisely of all matrices with determinant 1. By Theorem 3.5.1 this group is
generated by all elementary matrices, and these are easily seen to be transvections.
Now assume that $m > 1$, let $\sigma \in Sp_{2m}(k)$ and take any hyperbolic
pair $(u, v)$ in $V$. By (ii) there exists $\tau \in T$ such that $(u, v)\sigma = (u, v)\tau$, hence
$\sigma\tau^{-1}$ leaves $u, v$ fixed. Therefore it maps $W = \langle u, v \rangle^\perp$ into itself and defines
an isometry there. By the induction hypothesis $\sigma\tau^{-1}|W = \prod \tau_i'$, where $\tau_i'$ is
a symplectic transvection on $W$. We can extend $\tau_i'$ to a symplectic transvection
$\tau_i$ on $V$ by defining it as the identity on $\langle u, v \rangle$. Then $\sigma = \prod \tau_i.\tau$ and this shows
that $\sigma \in T$. ∎


Since a symplectic transvection clearly has determinant 1, we obtain


Corollary 3.6.4. Every symplectic transformation has determinant 1. ∎


Theorem 3.6.3 also allows us to determine the centre of $Sp_{2m}(k)$:


Corollary 3.6.5. The centre of $Sp_{2m}(k)$ consists of the transformations $I$ and $-I$.


Proof. If $\tau$ is a symplectic transvection along $u$, then $x\tau - x$ is proportional to $u$,
hence $x\tau\sigma - x\sigma$ is proportional to $u\sigma$, for any $\sigma \in Sp_{2m}(k)$. Now $x\sigma\tau - x\sigma$ is proportional
to $u$ and not always 0, so if $\sigma$ and $\tau$ commute, then $u\sigma$ must be proportional
to $u$. For a given $\sigma$ this can happen for all $u$ only if $\sigma$ is a scalar, say
$\sigma = c.1$. Now the condition (3.6.2) shows that $c^2 = 1$, hence $c = \pm 1$. ∎


We next determine the commutator structure of the symplectic group. 


Theorem 3.6.6. $Sp_{2m}(k)$ is perfect except when $m = 1$ and $|k| \leq 3$ or $m = 2$ and
$|k| = 2$.


Proof. In the proof of Theorem 3.6.3 we saw that $Sp_2(k) \cong SL_2(k)$ and by Proposition
3.5.2 this is perfect when $|k| > 3$, so we may assume that $m \geq 2$. Suppose first
that $|k| > 3$; we shall show that every transvection $\tau_{u,a}$ is a commutator. In $k$ there
exists $c \neq 0$ such that $c^2 \neq 1$. Put $b = (1 - c^2)^{-1}a$, $d = -c^2b$; then $b + d = a$, hence
$\tau_{u,a} = \tau_{u,b}\tau_{u,d}$. If $\sigma$ is any symplectic mapping such that $u\sigma = cu$ (Theorem 3.6.3),
then

\[ \sigma^{-1}\tau_{u,b}^{-1}\sigma = \sigma^{-1}\tau_{u,-b}\sigma = \tau_{u\sigma,-b} = \tau_{cu,-b} = \tau_{u,-c^2b} = \tau_{u,d}. \]

Hence

\[ \tau_{u,a} = \tau_{u,d}\tau_{u,b} = \sigma^{-1}\tau_{u,b}^{-1}\sigma\tau_{u,b}, \tag{3.6.4} \]

and this is the required expression.

There remains the case where $k$ has two or three elements. It will be enough to find
a transvection $\tau_{u,a} \neq 1$ in the derived group; since $Sp_{2m}(k)$ is transitive on the non-zero
vectors of $V$, we then have $\sigma^{-1}\tau_{u,a}\sigma = \tau_{u\sigma,a}$ in the derived group, and in case
$|k| = 3$, $\tau_{u,-a} = \tau_{u,a}^{-1}$, so it follows that the derived group contains all transvections
and so by Theorem 3.6.3, coincides with $Sp_{2m}(k)$. We shall write our transformations
as matrices relative to a symplectic basis. Let $A, B \in k_m$ be $m \times m$ matrices over $k$; if $A$ is invertible and $B$ is
symmetric, then

\[ S_A = \begin{pmatrix} A & 0 \\ 0 & (A^T)^{-1} \end{pmatrix} \quad \text{and} \quad R_B = \begin{pmatrix} I & B \\ 0 & I \end{pmatrix} \]

are symplectic, as is easily verified. With this notation we have

\[ (S_A, R_B) = R_C, \quad \text{where } C = B - A^{-1}B(A^T)^{-1}, \tag{3.6.5} \]

and for suitable choice of $A, B$ this is a symplectic transvection.


Suppose first that $|k| = 3$, $m = 2$. Taking, for instance, $A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$, $B = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, we find
that $B - A^{-1}B(A^T)^{-1} = 2E_{11}$, so we have obtained a transvection which is a commutator.
The same argument applies for $m > 2$, taking $A$ as $I$ and $B$ as 0 on the
remaining coordinates.


There remains the case $|k| = 2$, $m \geq 3$. When $m = 3$, we may take, for instance,

\[ A = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}, \qquad B = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix}. \]

It is easily verified that $B - A^{-1}B(A^T)^{-1} = E_{11}$, so we have again a symplectic
transformation which is a commutator; for $m > 3$ we again take $A, B$ to be $I$, 0 respectively
on the new coordinates. So we have in all cases expressed a transvection as
a commutator, and the result follows. ∎
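The matrix $C = B - A^{-1}B(A^T)^{-1}$ of (3.6.5) can be computed mechanically over a prime field. In the sketch below the matrices `a`, `b` and the inverses `a_inv` are illustrative choices satisfying the hypotheses ($A$ invertible, $B$ symmetric); they are assumptions of this example and need not coincide with the matrices printed in the original. The inverse is supplied by hand and verified rather than computed.

```python
def matmul(x, y, p):
    """Matrix product over the prime field F_p."""
    return [[sum(x[i][t] * y[t][j] for t in range(len(y))) % p
             for j in range(len(y[0]))] for i in range(len(x))]


def transpose(x):
    return [list(col) for col in zip(*x)]


def c_matrix(a, a_inv, b, p):
    """Return C = B - A^{-1} B (A^T)^{-1} over F_p after checking the hypotheses."""
    n = len(a)
    ident = [[int(i == j) for j in range(n)] for i in range(n)]
    if matmul(a, a_inv, p) != ident:
        raise ValueError("a_inv is not the inverse of a modulo p")
    if b != transpose(b):
        raise ValueError("b must be symmetric")
    # (A^T)^{-1} is the transpose of A^{-1}.
    t = matmul(matmul(a_inv, b, p), transpose(a_inv), p)
    return [[(b[i][j] - t[i][j]) % p for j in range(n)] for i in range(n)]


# |k| = 3, m = 2: expect C = 2*E_11.
c3 = c_matrix([[1, 1], [0, 1]], [[1, 2], [0, 1]], [[0, 1], [1, 0]], 3)

# |k| = 2, m = 3: expect C = E_11.
c2 = c_matrix([[1, 1, 0], [0, 0, 1], [1, 0, 0]],
              [[0, 0, 1], [1, 0, 1], [0, 1, 0]],
              [[0, 0, 1], [0, 1, 1], [1, 1, 1]], 2)
```

In both cases $C$ is a non-zero multiple of $E_{11}$, so $R_C$ is a symplectic transvection, as the proof requires.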


Finally we come to the simplicity proof, which runs along similar lines to that for 
the general linear group (see Section 3.5). 


Theorem 3.6.7. The projective symplectic group $PSp_{2m}(k)$ is simple for all fields $k$ and
all integers $m \geq 1$, except when $m = 1$ and $|k| \leq 3$ or $m = 2$ and $|k| = 2$.


Proof. Since $Sp_2(k) \cong SL_2(k)$, we may assume that $m > 1$. We consider the action of
$G = PSp_{2m}(k)$ on the space $P = P^{2m-1}(k)$ and begin by showing that $G$ is primitive.
Let $Q$ be a set in a partition of $P$ compatible with the $G$-action and containing more
than one point. Suppose first that $Q$ contains a pair of points $\langle x \rangle, \langle y \rangle$ defined by
a hyperbolic pair of vectors $(x, y)$. Given any other point $\langle z \rangle$ in $P$, if $b(x, z) \neq 0$,
we may assume that $b(x, z) = 1$; since $G$ is transitive on the hyperbolic pairs, by
Theorem 3.6.3, there exists $\sigma \in G$ mapping $(x, y)$ to $(x, z)$, hence $Q\sigma \cap Q \neq \emptyset$, so
$Q\sigma = Q$ and $\langle z \rangle \in Q\sigma = Q$. If $b(x, z) = 0$, we may assume that
$\langle z \rangle \neq \langle x \rangle$, because otherwise $\langle z \rangle = \langle x \rangle \in Q$. Then there exists $w \in V$ such that
$b(x, w) = b(z, w) = 1$, and by what has been shown, $\langle w \rangle \in Q$. Further there exists
$\sigma \in G$ mapping $(x, w)$ to $(z, w)$, hence $Q\sigma \cap Q \neq \emptyset$, so again $\langle z \rangle \in Q\sigma = Q$ and
it follows that $Q = P$. The alternative is that the subspace defining $Q$ is totally
isotropic. Let $\langle x, y \rangle$ be a plane in $Q$ and choose $w \in V$ such that $b(x, w) = 1$,
$b(y, w) = 0$. Writing $H = \langle x, w \rangle$, we have $V = H \perp H^\perp$. Further, $y \in H^\perp$ and for
$0 \neq z \in H^\perp$ there exists a symplectic transformation $\sigma$ leaving $H$ fixed and mapping
$y$ to $z$. Since $\langle x \rangle \in Q$, it follows that $Q\sigma = Q$ and since $\langle y \rangle \in Q$, we have
$\langle z \rangle \in Q\sigma = Q$. Hence $Q$ contains all points defined by vectors in $H^\perp$. Since
$m > 1$, $H^\perp$ contains a hyperbolic pair, so the first part of the argument can be
applied to show that $Q = P$, and this shows $G$ to be primitive.

In order to apply Proposition 3.5.5, we need to find a normal abelian subgroup of
a stabilizer whose conjugates generate $G$. Take a point $\langle x \rangle$, let $S$ be its stabilizer and
let $A$ be the group of transvections $\tau_{x,a}$ ($a \in k$). Then (3.6.4) shows that $A \lhd S$ and by
Theorem 3.6.3, $A$ and its conjugates generate $G$. Hence $G$ is indeed simple, with the
exceptions listed. ∎


Again the exceptions actually occur (see Exercises 2 and 3). 
It still remains to compute the order of $Sp_{2m}(k)$ when $k$ is finite.


Proposition 3.6.8. If $k$ is a field of $q$ elements and $m \geq 1$, then

\[ |Sp_{2m}(k)| = q^{2m-1}(q^{2m} - 1).|Sp_{2m-2}(k)|, \tag{3.6.6} \]

hence

\[ |Sp_{2m}(k)| = q^{2m-1}(q^{2m} - 1)\,q^{2m-3}(q^{2m-2} - 1) \cdots q(q^2 - 1). \tag{3.6.7} \]

For $PSp_{2m}(k)$ the order is the same when $q$ is even and has one half this value when $q$
is odd.


Proof. Let $V$ be a $2m$-dimensional symplectic space over $k$; we first determine the number of
hyperbolic pairs in $V$. For a hyperbolic pair $(x, y)$, $x$ may be any non-zero vector
in $V$, so there are $q^{2m} - 1$ choices. Given a particular hyperbolic pair $(x, y_0)$, any
other hyperbolic pair with first vector $x$ has the form $(x, y)$, where $y = y_0 + z$ for
a vector $z \in x^\perp$, and here we have $q^{2m-1}$ choices for $z$. Hence there are
$q^{2m-1}(q^{2m} - 1)$ pairs in all. By Theorem 3.6.3, $Sp_{2m}(k)$ is transitive on the set of
these pairs, and the stabilizer of a particular pair $(x, y)$ is isomorphic to the symplectic
group of $\langle x, y \rangle^\perp$, which is $Sp_{2m-2}(k)$. Hence we obtain (3.6.6) and now
(3.6.7) is an easy consequence. The final remark follows because the centre is $\pm 1$,
as we saw in Corollary 3.6.5, and $-1 = 1$ when $q$ is even. ∎
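The recursion (3.6.6) translates directly into a short computation. The sketch below assumes `q` is a prime power (not checked) and uses hypothetical function names.

```python
def order_sp(m, q):
    """|Sp_{2m}(F_q)| as the product of q^(2i-1) * (q^(2i) - 1) for i = 1..m."""
    result = 1
    for i in range(1, m + 1):
        result *= q**(2 * i - 1) * (q**(2 * i) - 1)
    return result


def order_psp(m, q):
    """|PSp_{2m}(F_q)|: the centre {1, -1} has order 2 exactly when q is odd."""
    return order_sp(m, q) // (1 if q % 2 == 0 else 2)
```

For example `order_sp(2, 2)` evaluates to 720, which is the order of $\mathrm{Sym}_6$, while `order_psp(1, 2)` and `order_psp(1, 3)` give 6 and 12, the orders of the two soluble exceptions with $m = 1$.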


Exercises 


1. Show that in the action of $PSp_{2m}(k)$ on $P^{2m-1}(k)$, the stabilizer of a point has
exactly three orbits.

2. Verify that $PSp_2(\mathbf{F}_2)$ and $PSp_2(\mathbf{F}_3)$ are both soluble, of orders 6 and 12 respectively,
and express them as permutation groups.

3. Show that $Sp_4(\mathbf{F}_2) \cong \mathrm{Sym}_6$ by considering its action on quintuples of vectors
$u_1, \ldots, u_5$ such that $b(u_i, u_j) = 1$ for $i \neq j$ in a four-dimensional symplectic
space over $\mathbf{F}_2$. (Hint. Find the number of hyperbolic pairs in a quintuple and
the number of quintuples containing a given hyperbolic pair.)


3.7 The orthogonal group 


The orthogonal group is the group of all orthogonal transformations of a quadratic
space $V$, denoted by $O(V)$ or $O_n(k)$, where $k$ is the ground field and $n$ the dimension
of the space. This group leads to a class of simple groups, just as the symplectic
group does, but this time the result is dependent on the precise structure of the
underlying quadratic space. Our treatment follows Iwasawa and Tamagawa, as in
the account of Jacobson (1985). The field $k$ is again assumed to have characteristic
not equal to 2, and we shall denote the quadratic form by $q$ and the associated
bilinear form by $b$, so that $b(x, x) = q(x)$. In what follows we shall assume that
$\dim V \geq 3$; the case $\dim V = 1$ or 2 is easily dealt with separately (see Exercises 4
and 5). We begin by determining the centre of $O(V)$. This turns out to be the
same as that of the symplectic group, but for the proof we use symmetries instead
of transvections.


Proposition 3.7.1. Let $V$ be a regular quadratic space of dimension $\geq 3$. Then the
centre of $O(V)$ is $\pm 1$.


Proof. If $u$ is any anisotropic vector in $V$ and $\sigma_u : x \mapsto x - 2(b(x, u)/q(u))u$ the symmetry
defined by $u$, then $\langle u \rangle = \{x \in V \mid x\sigma_u = -x\}$; hence any $\alpha$ in the centre of
$O(V)$ has the property that $u\alpha \in \langle u \rangle$ for every anisotropic vector $u$. Let $u_1, \ldots, u_n$
be an orthogonal basis of $V$; then $u_i\alpha = \pm u_i$, so by suitable numbering we may
assume that $u_i\alpha = u_i$ for $i = 1, \ldots, s$ and $u_i\alpha = -u_i$ for $i > s$. If $u_i + u_j$ is anisotropic,
we have $(u_i + u_j)\alpha = \pm(u_i + u_j)$ and it follows that either $i, j \leq s$ or $i, j > s$.
Thus if $1 \leq s < n$, then $u_1 + u_n$ is isotropic, hence for $1 < i < n$,
$q(u_1 + u_i + u_n) = q(u_i) \neq 0$, so $(u_1 + u_i + u_n)\alpha = \varepsilon(u_1 + u_i + u_n)$, where $\varepsilon = \pm 1$,
but also $(u_1 + u_i + u_n)\alpha = u_1 + \varepsilon'u_i - u_n$ and this leads to a contradiction. Therefore
$s$ is 0 or $n$ and $\alpha = 1$ or $-1$. ∎
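The symmetry $\sigma_u$ used in this proof can be tried out on a concrete quadratic space. The sketch below assumes a diagonal form over the rationals with $b(x, x) = q(x)$, as stated at the start of this section; the coefficients in `D`, the vectors `u`, `x` and the function names are arbitrary illustrative choices.

```python
from fractions import Fraction

# Diagonal form q(x) = x_1^2 - x_2^2 + 2*x_3^2 with b(x, y) = sum d_i*x_i*y_i,
# so that b(x, x) = q(x).
D = [Fraction(1), Fraction(-1), Fraction(2)]


def b(x, y):
    return sum(d * xi * yi for d, xi, yi in zip(D, x, y))


def q(x):
    return b(x, x)


def symmetry(u):
    """Return sigma_u: x -> x - 2*(b(x, u)/q(u))*u for an anisotropic vector u."""
    if q(u) == 0:
        raise ValueError("u must be anisotropic")

    def sigma(x):
        c = 2 * b(x, u) / q(u)
        return [xi - c * ui for xi, ui in zip(x, u)]
    return sigma


u = [Fraction(1), Fraction(2), Fraction(1)]   # q(u) = 1 - 4 + 2 = -1
x = [Fraction(3), Fraction(0), Fraction(-1)]  # q(x) = 9 + 2 = 11
sigma = symmetry(u)
image = sigma(x)
```

The map sends $u$ to $-u$, fixes $q$, and is an involution, which is all that the proof uses.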


The symmetry $\sigma_u$ is defined only for anisotropic vectors $u$; in the isotropic case
one has the following replacement, going back to Carl Ludwig Siegel.

Let $u$ be an isotropic vector, choose $v$ so that $u, v$ is a hyperbolic pair and take
$0 \neq w \in \langle u, v \rangle^\perp$. We have $V = \langle v \rangle \oplus u^\perp$, hence the equations

\[ \begin{aligned} x\rho_{u,w} &= x + b(x, w)u \quad (x \in u^\perp), \\ v\rho_{u,w} &= v - q(w)u - w, \end{aligned} \tag{3.7.1} \]

define the linear transformation $\rho_{u,w}$ completely and it is easily checked that $\rho_{u,w}$
is orthogonal. Moreover, it is proper (i.e. of determinant 1); in fact it is unipotent,
i.e. $1 - \rho_{u,w}$ is nilpotent.

We remark that $\rho_{u,w}$ is uniquely determined as the orthogonal transformation
which maps $x$ to $x + b(x, w)u$ for all $x \in u^\perp$. For this mapping can be extended to
an orthogonal transformation, by Witt's theorem (BA, Theorem 8.5.5) and if
there were two such mappings, their quotient $\alpha$ would leave $u^\perp$ fixed. Now $v\alpha$ has
the form $\lambda u + \mu v + z$, where $z \in \langle u, v \rangle^\perp$. For any $x \in \langle u, v \rangle^\perp$ we have $0 = b(x, v) =
b(x\alpha, v\alpha) = b(x, \lambda u + \mu v + z) = b(x, z)$; hence $z$ lies in the radical of $\langle u, v \rangle^\perp$, which is 0 because this space is regular, i.e. $z = 0$. Further,
$1 = b(u, v) = b(u\alpha, v\alpha) = b(u, \lambda u + \mu v) = \mu$, thus $\mu = 1$ and finally $0 = q(v) =
q(v\alpha) = q(\lambda u + v) = \lambda$, hence $\lambda = 0$. So $v\alpha = v$ and therefore $\alpha = 1$. This shows
$\rho_{u,w}$ to be uniquely determined by its effect on $u^\perp$.

We shall use the transformation $\rho_{u,w}$ to construct an abelian normal subgroup of
the stabilizer of $u$.


Lemma 3.7.2. Let $V$ be a regular quadratic space of dimension $\geq 3$, containing a hyperbolic
pair $u, v$ and put $W = \langle u, v \rangle^\perp$. Then the set

\[ A_u = \{\rho_{u,w} \mid w \in W\} \]

is a subgroup of $O(V)$ which is abelian and normal in the stabilizer of $u$. Moreover, the
mapping

\[ w \mapsto \rho_{u,w} \tag{3.7.2} \]

is an isomorphism of $W$ with $A_u$.


Proof. Given $w, w' \in W$, we see from (3.7.1) that $\rho_{u,w}\rho_{u,w'}$ and $\rho_{u,w+w'}$ both map $x$
to $x + b(x, w + w')u$, for any $x \in u^\perp$, hence they agree on the whole of $V$. This shows
(3.7.2) to be a homomorphism. If $w$ lies in the kernel of (3.7.2), then by (3.7.1),
$w \in u^{\perp\perp} = \langle u \rangle$ (see BA, Proposition 8.1.3), but the only multiple of $u$ in $W$ is 0;
this shows (3.7.2) to be injective. It follows that $A_u$ is an abelian subgroup of
$O(V)$. Clearly it is contained in the stabilizer of $u$. Moreover, for any $\sigma \in O(V)$
we have $\sigma^{-1}\rho_{u,w}\sigma = \rho_{u\sigma,w\sigma}$, hence $A_u$ is mapped into itself by any $\sigma$ leaving $u$
fixed. ∎


Our next aim will be to show that under some mild restrictions on $V$, the derived
group $O(V)'$ is generated by all the $A_u$. That some restriction is needed is clear since
there are no $A_u$ unless the Witt index of $V$ is positive. It will be convenient to write $\Omega$
for the subgroup of $O(V)$ generated by all the $A_u$, where $u$ ranges over all isotropic
vectors. We begin by establishing some transitivity properties.


Lemma 3.7.3. Let $V$ be a regular quadratic space, $\dim V \geq 3$. Then

(i) for any isotropic $u \in V$, $A_u$ is transitive on the one-dimensional isotropic subspaces
not orthogonal to $u$,

(ii) for any two linearly independent isotropic vectors $u_1, u_2$ there is a vector $v$ and
$\lambda_1, \lambda_2 \in k$ such that $(\lambda_iu_i, v)$ ($i = 1, 2$) are hyperbolic pairs,

(iii) the subgroup $\Omega$ of $O(V)$ generated by the $A_u$ is transitive on the hyperbolic pairs.


Proof. (i) Let $x, y$ be isotropic vectors not orthogonal to $u$; we may assume that
$b(u, x) = b(u, y) = 1$. Write $y = \lambda u + \mu x + z$, where $z \in \langle u, x \rangle^\perp$; since $b(u, y) = 1$,
we have $\mu = 1$ and the relation $q(y) = 0$ shows that $\lambda + q(z) = 0$. Hence
$y = x - q(z)u + z = x\rho_{u,-z}$ and (i) follows.

(ii) Let $u_1, u_2$ be as stated; if $b(u_1, u_2) \neq 0$, we may assume that $b(u_1, u_2) = 1$.
The space $\langle u_1, u_2 \rangle^\perp$ contains an anisotropic vector $w$, by the regularity of $V$. We
put $v = u_1 - q(w)u_2 + w$; then $q(v) = -q(w) + q(w) = 0$, $b(u_1, v) = -q(w) \neq 0$,
$b(u_2, v) = 1$, so with $\lambda_1 = -1/q(w)$, $\lambda_2 = 1$, $v$ satisfies the required conditions. If
$b(u_1, u_2) = 0$, then since $u_1, u_2$ are linearly independent, there is a linear functional
equal to 1 on $u_1, u_2$, hence by the regularity of $b$ there exists $v$ such that $b(u_i, v) = 1$,
and by subtracting suitable multiples of $u_1$ from $v$ we can ensure that $q(v) = 0$. Then
$(u_i, v)$ are again two hyperbolic pairs.

(iii) Let $(u_i, v_i)$ ($i = 1, 2$) be any two hyperbolic pairs. If $u_1, u_2$ are linearly independent,
we can by (ii) find $v$ and $\lambda_i$ such that $(\lambda_iu_i, v)$ are hyperbolic pairs; by
(i) there is then an element of $A_v$ mapping $\langle u_1 \rangle$ to $\langle u_2 \rangle$. Thus for some $\sigma \in \Omega$,
$u_1\sigma = cu_2$. We now have the hyperbolic pairs $(u_1\sigma, v_1\sigma) = (cu_2, v_1\sigma)$ and $(u_2, v_2)$;
applying (i) again, we find $\tau \in A_{u_2}$ such that $v_1\sigma\tau = v_2$, $u_1\sigma\tau = cu_2$, so $\sigma\tau$ maps
$(u_1, v_1)$ to $(cu_2, v_2)$, but $1 = b(u_1, v_1) = b(cu_2, v_2) = c$, hence $c = 1$. If $u_1, u_2$ are
linearly dependent, we can by (i) apply an element of $\Omega$ to $(u_1, v_1)$ to obtain a
hyperbolic pair with first vector linearly independent of $u_2$ and now proceed as
before. ∎


We recall the Cartan-Dieudonné theorem, which shows that $O(V)$ is generated
by symmetries (see BA, Corollary 8.3.3). It follows that the square of every orthogonal
transformation lies in the derived group $O(V)'$. For if $\alpha = \sigma_1 \cdots \sigma_r$, then
$\alpha^2 = \sigma_1 \cdots \sigma_r\sigma_1 \cdots \sigma_r \equiv \sigma_1^2 \cdots \sigma_r^2 = 1 \pmod{O(V)'}$. We also note that for
$\dim V \geq 3$ we have $O(V)' = SO(V)'$ (BA, Theorem 8.3.4). In two dimensions
this need not hold; as an example (which is used later) let us determine $O(V)$ for
a hyperbolic plane.

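Let $H$ be a hyperbolic plane with the hyperbolic pair $u, v$ as basis. Any linear
mapping has the form

\[ u' = au + bv, \qquad v' = cu + dv, \]

and this is an isometry iff $q(u') = q(v') = 0$, $b(u', v') = 1$. Thus $ab = cd = 0$,
$ad + bc = 1$, so either $b = c = 0$ or $a = d = 0$, and the isometries of $H$ have the
following forms:

\[ \lambda_a = \begin{pmatrix} a & 0 \\ 0 & a^{-1} \end{pmatrix} \quad \text{or} \quad \lambda_a\tau = \begin{pmatrix} 0 & a \\ a^{-1} & 0 \end{pmatrix}, \quad \text{where } \tau = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \tag{3.7.3} \]

The $\lambda_a$ form a group isomorphic to $k^\times$ and $O(H)$ is an extension by a cyclic group of
order 2, acting by inversion: $\tau\lambda_a\tau = \lambda_a^{-1}$. In particular,

\[ \lambda_{a^2} = (\tau, \lambda_a). \tag{3.7.4} \]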
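The relations in $O(H)$ can be checked with exact rational arithmetic. The sketch below represents an isometry by the matrix whose rows are the images of $u$ and $v$, and tests the isometry condition against the Gram matrix of the hyperbolic pair; the value $a = 3/2$ and the helper names are assumptions chosen for illustration.

```python
from fractions import Fraction


def matmul2(x, y):
    """Product of two 2 x 2 matrices."""
    return [[x[0][0] * y[0][0] + x[0][1] * y[1][0], x[0][0] * y[0][1] + x[0][1] * y[1][1]],
            [x[1][0] * y[0][0] + x[1][1] * y[1][0], x[1][0] * y[0][1] + x[1][1] * y[1][1]]]


def transpose2(x):
    return [[x[0][0], x[1][0]], [x[0][1], x[1][1]]]


def lam(a):
    """The isometry lambda_a: u -> a*u, v -> a^{-1}*v."""
    return [[a, 0], [0, 1 / a]]


TAU = [[0, 1], [1, 0]]   # interchanges u and v
GRAM = [[0, 1], [1, 0]]  # b(u, u) = b(v, v) = 0, b(u, v) = 1


def is_isometry(p):
    """p preserves b iff p * GRAM * p^T == GRAM."""
    return matmul2(matmul2(p, GRAM), transpose2(p)) == GRAM


a = Fraction(3, 2)
# tau acts by inversion: tau * lambda_a * tau == lambda_{1/a}.
inversion_ok = matmul2(matmul2(TAU, lam(a)), TAU) == lam(1 / a)
# Commutator (tau, lambda_a) = tau^{-1} * lambda_a^{-1} * tau * lambda_a, with tau^{-1} = tau.
comm = matmul2(matmul2(TAU, lam(1 / a)), matmul2(TAU, lam(a)))
```

The commutator comes out as $\lambda_{a^2}$, so every square in $k^\times$ gives an element of the derived group of $O(H)$.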


We can now determine the subgroup $\Omega$:


Theorem 3.7.4. Let $V$ be a regular quadratic space of dimension $n \geq 3$ and positive
Witt index $\nu$. Then $\Omega \supseteq O(V)'$; more precisely, the subgroup $\Omega$ generated by the
$A_u$ ($u$ isotropic) coincides with the derived group: $\Omega = \Omega' = O(V)'$, except when
$n = 4$, $\nu = 2$ and $n = 3$, $|k| = 3$.

Proof. As a first step we show that 2 > O(V)’. We fix a hyperbolic pair (u, v) and 
write O,; = O((u, v)), W = (un, vy, We claim that every symmetry is conjugate 
under &2 to one in Qj. For if x is any anisotropic vector, then x; = u + q(x)v satisfies 
q(x,) = q(x), hence there exists op € O(V) such that x) = x. By Lemma 3.7.3(ii1) 
there exists @€& mapping (up,vp) to (u.v); since x, € (u,v), we have 
x=x)p€(up.vp) and so xa=x,pa € (u,v). Hence a 'o,.a@ = odyq, where 
xa € (u.v), as we wished to show. 

Now $O_1 = O(\langle u, v \rangle)$ is generated by symmetries $\sigma_x$ ($x \in \langle u, v \rangle$), and the restriction
$\sigma_x|W$ is the identity; it follows that the mapping $\sigma_x \mapsto \sigma_x|\langle u, v \rangle$ defines an
isomorphism $O_1 \cong O(H)$, and clearly $SO_1$ corresponds to $SO(H)$ in this isomorphism.
Now any $\rho \in SO(V)$ can be written as a product of an even number of symmetries:


$$\rho = \sigma_1 \cdots \sigma_{2r},$$


and we have seen that $\sigma_i = \nu_i^{-1}\tau_i\nu_i$, where $\tau_i$ is a symmetry in $O_1$ and $\nu_i \in \Omega$. Thus
we have


$$\rho = (\nu_1^{-1}\tau_1\nu_1) \cdots (\nu_{2r}^{-1}\tau_{2r}\nu_{2r}).$$


Since $\Omega$ is normal in $O(V)$, this relation can be written $\rho = \mu\tau_1 \cdots \tau_{2r}$, where $\mu \in \Omega$,
so we have $SO(V) \subseteq \Omega.SO_1$. But as we saw, $\Omega \subseteq SO(V)$, and we conclude that

$$SO(V) = \Omega.SO_1. \tag{3.7.5}$$


Therefore $SO(V)/\Omega \cong SO_1/(SO_1 \cap \Omega)$, but $SO_1 \cong k^\times$ is abelian, by the earlier
remark, so $\Omega \supseteq SO(V)' = O(V)'$ and we find that


$$\Omega \supseteq SO(V)'. \tag{3.7.6}$$


To complete the proof we show that $\Omega$ is perfect, for then $\Omega = \Omega' = SO(V)'$. It will
be enough to show that $\rho_{u,w} \in \Omega'$ for all isotropic $u$ and all $w$ orthogonal to a
hyperbolic plane containing $u$. So we may take $u$, $v$, $W$, $O_1$ as before. Let $\lambda_a$, $\tau$ be the
transformations in $O_1$ defined as in (3.7.3), so that $\lambda_{a^2} \in \Omega$ by (3.7.4) and (3.7.6). For any
$w \in W$ we have


$$\lambda_{a^2}^{-1}\rho_{u,w}^{-1}\lambda_{a^2}\rho_{u,w} = \rho_{a^2u,-w}\,\rho_{u,w} = \rho_{u,-a^2w}\,\rho_{u,w} = \rho_{u,(1-a^2)w}.$$


When $|k| > 3$, we can choose $a \in k^\times$ such that $a^2 \neq 1$ and replacing $w$ by
$(1 - a^2)^{-1}w$ we see that $\rho_{u,w} \in \Omega'$, hence $\Omega = \Omega' = O(V)'$ when $|k| > 3$.
Suppose now that $|k| = 3$; then $n \geq 4$ and for $n = 4$, $\nu = 1$. Thus $\dim W \geq 2$ and
we have an orthogonal basis $w_1 = w, w_2, \ldots, w_r$ for $W$. If $q(w_1) = q(w_2)$, then
the map $\alpha : w_1, w_2, w_3, \ldots, w_r \mapsto -w_2, w_1, w_3, \ldots, w_r$ is an isometry, hence


$\alpha^2 \in O(V)' \subseteq \Omega$ and $\alpha^2$ maps $w$ to $-w$. Thus


$$\rho_{u,w}^{-1}\alpha^{-2}\rho_{u,w}\alpha^2 = \rho_{u,-w}\,\rho_{u,-w} = \rho_{u,-2w} = \rho_{u,w}$$


and this shows that $\rho_{u,w} \in \Omega'$.

It remains to justify the assumption $q(w_1) = q(w_2)$. Since $q(w_i) \in k^\times$ and $|k| = 3$,
$q(w_i)$ is $1$ or $-1$. For $n = 4$, $W$ is anisotropic, so $q(w_1)$, $q(w_2)$ have the same sign and
hence must be equal. When $n \geq 5$, $W \cap w^{\perp}$ is regular and at least two-dimensional,
so $q$ restricted to this subspace is universal (see BA, Theorem 8.2.7), and it follows
that $W$ contains $w'$ orthogonal to $w$ such that $q(w') = q(w)$. Thus we can in all
cases find a basis of the required form.


We now have the means at our disposal to prove the main structure theorem for 
orthogonal groups. The result was first established, with a restriction on the index, by 
Dickson (1901), and in its full generality by Dieudonné in 1940. 


Theorem 3.7.5. Let $V$ be a regular quadratic space of dimension $n \geq 3$ and of positive
Witt index $\nu$. Denote by $C$ the centre of $SO(V)$. Then $SO(V)/C$ is simple, except when
$n = 4$, $\nu = 2$ and $n = 3$, $|k| = 3$.


Proof. We remark that $C$ has order 2 when $n$ is even and is trivial when $n$ is odd. For
by Proposition 3.7.1 the centre can only contain $1$ and $-1$ and $\det(-1) = (-1)^n$.


Consider the quadric cone $Q$ defined as the set of points $\langle x \rangle$ in $P^{n-1}(k)$ satisfying
$q(x) = 0$. By Lemma 3.7.3(i), $\Omega$ acts transitively on $Q$. We first show that the action
is primitive except when $n = 4$, $\nu = 2$. If $b(x, y) \neq 0$ for any two linearly independent
isotropic vectors $x$, $y$, then by Lemma 3.7.3(ii), $\Omega$ is 2-fold transitive, and
hence primitive. In particular, this always holds for $\nu = 1$, so we may assume
henceforth that $\nu \geq 2$ and hence $n \geq 5$. Let $S$ be a subset with more than one point in a
partition of $Q$ stable under $\Omega$; we have to show that $S = Q$. Given $\langle x_1 \rangle, \langle x_2 \rangle \in S$, if
$b(x_1, x_2) = 0$, then there exist $y_1, y_2 \in Q$ such that $(x_1, y_1)$, $(x_2, y_2)$ are orthogonal
hyperbolic pairs. By Lemma 3.7.3(i) applied to $\langle x_2, y_2 \rangle^{\perp}$ there exists $\alpha \in \Omega$ mapping
$x_1$ to $y_1$ and leaving $x_2$, $y_2$ fixed. Since $\langle x_2 \rangle \in S$, we have $S\alpha = S$ and $\langle x_1 \rangle \in S$,
therefore $\langle y_1 \rangle \in S$. Now if $\langle z \rangle$ is any point of $Q$ different from $\langle x_1 \rangle$, we can by Lemma
3.7.3(ii) find $v$ such that $(x_1, v)$, $(z, v)$ are hyperbolic pairs. By Lemma 3.7.3(iii)
there exists $\beta \in \Omega$ mapping $(x_1, y_1)$ to $(x_1, v)$, hence $S\beta = S$ and $\langle v \rangle \in S\beta = S$.
Similarly there is $\gamma \in \Omega$ mapping $(x_1, v)$ to $(z, v)$, hence $\langle v \rangle \in S\gamma = S$ and $z = x_1\gamma$,
so $\langle z \rangle \in S$. Since $z$ was arbitrary, this means that $S = Q$.

We may therefore assume that for any distinct points $\langle x_1 \rangle$, $\langle x_2 \rangle$ in $S$, $b(x_1, x_2) \neq 0$.
Given $\langle z \rangle \neq \langle x_1 \rangle$ in $Q$ as before, we can find $v$ such that $(x_1, v)$, $(z, v)$ are hyperbolic
pairs, and $\beta \in \Omega$ maps $(x_1, x_2)$ to $(x_1, v)$. It follows that $S\beta = S$ and $\langle v \rangle \in S$. If now
$\gamma \in \Omega$ is chosen so as to map $(x_1, v)$ to $(z, v)$, then since $v\gamma = v$, we have $S\gamma = S$ and
$z = x_1\gamma$, hence $\langle z \rangle \in S$ and we again find that $S = Q$.

We thus see that $\Omega$ acts primitively on $Q$ and $\Omega$ is perfect, by Theorem 3.7.4.
Moreover, if $u$ is isotropic, then $\Lambda_u$ is a normal abelian subgroup of the stabilizer
of $\langle u \rangle$ and the conjugates of $\Lambda_u$ generate $\Omega$, by the proof of Theorem 3.7.4. Hence
by Proposition 3.5.5, $\Omega$ is simple and now the result follows by Theorem 3.7.4.


Some of the exceptions of Theorem 3.7.5 will be considered in the exercises. Let us 
now take up some particular cases to show that the hypotheses on the Witt index 
cannot be omitted, without striving for full generality. We take a quadratic form 
over R; if its index is 0, the form must be definite, say positive definite, and in 
suitable coordinates it will have an orthonormal basis. For simplicity consider the 
case $n = 3$, thus $SO(V)$ is the group of rotations in 3-space. We claim that
$SO(V)$ acts primitively on the unit sphere $S$. For let $T$ be a subset of $S$ with more
than one point, stable under all rotations. Given $p, q \in T$, $T$ must include all
points of the circle through $q$ about $p$ as axis. If the points at opposite ends of a
diameter of this circle are a spherical distance $d$ apart, then $T$ will include points
at any distance $\leq d$ from $q$, and by repetition, points at any finite (spherical)
distance, hence $T = S$ and the action is primitive. The rotations about a point form
an abelian subgroup whose conjugates generate $SO(V)$, and this shows
$SO(V)$ to be simple. The same argument applies for any odd dimension $\geq 3$,
while for even dimensions $\geq 6$, $PSO(V)$ is simple. When $\dim V = 4$, we have
$P\Omega \cong G \times G$, where $G = PSL_2(k)$ when $V$ has index 2, and $G = SO(\mathbf{R}^3)$ for a
Euclidean 3-space when $V = \mathbf{R}^4$ is Euclidean. When $\mathbf{R}^4$ has index 1 (e.g. the Lorentz
metric of relativity theory), then $P\Omega \cong PSL_2(\mathbf{C})$; in this case $\Omega$ consists of all
rotations which do not reverse the time direction.

The argument just used to show that for a definite quadratic form $PSO(\mathbf{R}^n)$ is
simple depended essentially on the fact that $\mathbf{R}$ is Archimedean ordered. For an
ordered field $K$ which is non-Archimedean (i.e. there are elements greater than any
integer) it can be shown that $PSO(K^n)$ is not simple: the infinitesimal rotations
generate a proper normal subgroup (see Exercise 6). This happens, for example, 
for the field of formal Laurent series R((x)), ordered by the sign of its first coefficient. 

For a finite field it is again possible to calculate the order of the orthogonal group, 
but this depends on the quadratic character of the determinant as well as the parity 
of the dimension (see Exercises 7-9). 
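The solution counts that underlie these order formulae can be confirmed numerically for small cases. The sketch below is an illustration only, under the assumption that $k = \mathbf{F}_q$ with $q$ an odd prime (so that arithmetic mod $q$ suffices). It checks by exhaustive search that $\sum_{i=1}^m (x_i^2 - y_i^2) = b$ has $q^{2m-1} - q^{m-1} + \delta_{0b}q^m$ solutions and that $\sum_{i=1}^m (x_i^2 - y_i^2) - z^2 = b$ has $q^{2m} + (-b/q)q^m$ solutions, where $(-b/q)$ is the Legendre symbol. The function names are chosen here for illustration.

```python
from itertools import product

def legendre(a, q):
    """Legendre symbol (a/q) for an odd prime q: 0, 1 or -1."""
    a %= q
    if a == 0:
        return 0
    return 1 if pow(a, (q - 1) // 2, q) == 1 else -1

def count_solutions(form, nvars, q, b):
    """Number of v in F_q^nvars with form(v) = b, by exhaustive search."""
    return sum(1 for v in product(range(q), repeat=nvars)
               if (form(v) - b) % q == 0)

def split_form(m):
    """v = (x_1, y_1, ..., x_m, y_m)  ->  sum of x_i^2 - y_i^2."""
    return lambda v: sum(v[2 * i] ** 2 - v[2 * i + 1] ** 2 for i in range(m))

def split_minus_square(m):
    """v = (x_1, y_1, ..., x_m, y_m, z)  ->  sum of (x_i^2 - y_i^2) - z^2."""
    return lambda v: split_form(m)(v[:2 * m]) - v[2 * m] ** 2

def check(q, m):
    """Compare the brute-force counts with the closed formulae for all b."""
    for b in range(q):
        even = count_solutions(split_form(m), 2 * m, q, b)
        odd = count_solutions(split_minus_square(m), 2 * m + 1, q, b)
        if even != q ** (2 * m - 1) - q ** (m - 1) + (q ** m if b == 0 else 0):
            return False
        if odd != q ** (2 * m) + legendre(-b, q) * q ** m:
            return False
    return True

results = {(q, m): check(q, m) for q in (3, 5) for m in (1, 2)}
print(results)
```

Only the cases $q = 3, 5$ and $m = 1, 2$ are tested, so this is a sanity check of the formulae rather than a proof.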


Exercises 


1. Give the details of the proof that for a real Euclidean space $V$ of dimension $\geq 5$,
$PSO(V)$ is simple.

2. Verify that $\rho_{u,w}$ defined by (3.7.1) satisfies $(1 - \rho_{u,w})^3 = 0$. When is
$(1 - \rho_{u,w})^2 = 0$?

3. Let $V$ be an $n$-dimensional quadratic space. Given two anisotropic vectors $x$, $y$,
show that $\sigma_x\sigma_y$ is a rotation in the plane $\langle x, y \rangle$ leaving $\langle x, y \rangle^{\perp}$ fixed. Verify that
for a Euclidean $V$ the angle of rotation is twice the angle between $x$ and $y$.

4. Use the method of proof of Proposition 3.7.1 to find the centre of $O_1(k)$, $O_2(k)$.

5. Examine the form Lemmas 3.7.2 and 3.7.3 take when $\dim V = 1$ or 2.

6. Let $V$ be a Euclidean space over an ordered field $K$ which is non-Archimedean.
Show that the rotation through an infinitesimal angle generates a proper normal
subgroup ($\alpha$ is infinitesimal if $n\alpha < 1$ for all $n \in \mathbf{Z}$).

7. Show that over a finite field of odd characteristic every regular quadratic form
has the form $\langle 1^{n-1}, d \rangle$ where $d$ is the determinant and that $|k^\times/k^{\times 2}| = 2$, so
that there are just two classes of forms in each dimension. (Hint. Recall from
BA, Section 8.2 that a quadratic form of rank $\geq 2$ over a finite field is universal.)

8. Let $|k| = q$ be odd. Show that the number of solutions of $\sum_{i=1}^m (x_i^2 - y_i^2) = b$
is $q^{2m-1} - q^{m-1} + \delta_{0b}q^m$, the number of solutions of $\sum_{i=1}^m (x_i^2 - y_i^2) - (d - 1)y_1^2 = b$
is $q^{2m-1} + q^{m-1} - \delta_{0b}q^m$, and the number of solutions of
$\sum_{i=1}^m (x_i^2 - y_i^2) - z^2 = b$ is $q^{2m} + (-b/q)q^m$, where $(-b/q)$ is a Legendre
symbol, i.e. 0, 1 or $-1$ according as $-b$ is 0, a non-zero square or not a
square in $k$.
9. Show that for a regular form over a finite field F, (q odd), |O2,,(F,)| = 
Cm — ges NOoam— 14 F,,)|, [Orm+1(F,) |= = (qr a — d/q)q” Om (Fy)|, 


where d is the determinant of the form. Hence calculate the order of the ortho- 
gonal group. 

10. What form do the equations (3.7.1) take when $u$ is replaced by $-u$? Show that
$\rho_{-u,-w} = \rho_{u,w}$ and that the normalizer of $\Lambda_u$ includes any $\sigma$ mapping $u$ to $-u$.


Further exercises on Chapter 3 


1. Let $A$ be a group and $\alpha$ an automorphism of $A$. Show that there is a split
extension $E$ of $A$ by an infinite cyclic group such that $\alpha$ is induced by an inner
automorphism of $E$. Show also that any extension of $A$ by an infinite cyclic group
splits.


2. Defining an automorphism of an extension $E$ as an isomorphism of $E$ with itself,
show that the group of all automorphisms of an extension $E$ of an abelian group
$A$ by a group $G$ is isomorphic to the group of 1-cocycles of $G$ in $A$. Show that if
the extension is split and $H^1(G, A) = 0$, then any two complements of $A$ in $E$ are
conjugate.

3. Let $F$ be the free group on $X$ and $kF$ the group algebra over a field $k$. Obtain the
following analogue of (2.7.10):

$$0 \to {}^X(kF \otimes_k kF) \to kF \otimes_k kF \to kF \to 0.$$

By applying $k \otimes_{kF}$ deduce that the augmentation ideal $IF$ is free on $x - 1$ ($x \in X$)
as $F$-module. Hence show that $H^n(F, A) = H_n(F, A) = 0$ for $n > 1$.

4. Show that in a finite soluble group $G$, with a Hall subgroup of order $r$, the
number $h_r$ of subgroups of order $r$ in $G$ has the form $h_r = c_1 \cdots c_t$, where
each $c_i$ is a prime power dividing the order of a chief factor of $G$ and $c_i \equiv 1$
modulo a prime factor of $r$. (Hint. Put $|G| = rs$, where $(r, s) = 1$ and first
treat the case when $G$ has a normal subgroup of index $r's'$, where $r' | r$, $s' | s$
and $s' > 1$.)


5. Let $E/k$ be a finite Galois extension of degree $m = q_1 \cdots q_r$, where the $q_i$ are
powers of distinct primes. Show that if $E$ contains a subfield $E_i$ of degree $q_i$
over $k$, for $i = 1, \ldots, r$, then

$$E = E_1 \otimes \cdots \otimes E_r, \qquad [E_i : k] = q_i.$$

Use Hall's theorem to show that such a decomposition of $E$ exists iff
$G = \mathrm{Gal}(E/k)$ is soluble and that any two such decompositions of $E$ are
conjugate by an element of $G$.

6. Show that for any finite Galois extension $E/k$, a decomposition as in Exercise 5,
where the $E_i/k$ are all Galois extensions, exists iff $\mathrm{Gal}(E/k)$ is nilpotent.

7. Show that the order of a finite simple group is divisible either by 12 or by the
cube of the least prime dividing its order. (Hint. If a Sylow $p$-subgroup $P$ has
order $p$ or $p^2$, it is abelian; now use Theorem 3.3.2 to describe the action of
$N_G(P)$ on $P$.)


8. Show that $PSL_3(\mathbf{F}_2) \cong PSL_2(\mathbf{F}_7)$ and find its order. (Hint. This is the subgroup
of $\mathrm{Sym}_7$ in the action on the columns of the array

    1 2 3 4 5 6 7
    2 3 4 5 6 7 1
    4 5 6 7 1 2 3

which map each column into another. The columns may be interpreted as lines
in the projective plane over $\mathbf{F}_2$, $P^2(\mathbf{F}_2)$.)

9. Describe $O(V)'$ for a two-dimensional anisotropic space $V$.

10. Show that in the action of $PSp_{2m}(k)$ on $P^{2m-1}(k)$, the stabilizer of a point has
exactly three orbits.

11. Let $X$ be a subset of a free group $F$ and define an elementary transformation of $X$
as one of the following: (i) replace $x$ by $xy$ ($x, y \in X$, $x \neq y$), (ii) replace $x$ by $x^{-1}$
($x \in X$), (iii) omit 1. A Nielsen transformation is a series of elementary
transformations. Show that if $X'$ is obtained from $X$ by a Nielsen transformation, then
$\mathrm{gp}\{X'\} = \mathrm{gp}\{X\}$. Show also that any finite subset can be reduced by a Nielsen
transformation, i.e. brought to the form where $1 \notin X$; $x, y \in X \cup X^{-1}$, $xy \neq 1$
implies $|xy| \geq |x|, |y|$; and $x, y, z \in X \cup X^{-1}$, $xy \neq 1$, $yz \neq 1$ implies
$|xyz| > |x| - |y| + |z|$, where $|w|$ is the length of $w$ in terms of a basis.

12. (H. Zieschang) Let $U$ be a reduced set in a free group $F$ (see Exercise 11). For
each $u \in U \cup U^{-1}$ denote by $a(u)$ the longest prefix of $u$ cancelled in any
product $vu \neq 1$, $v \in U \cup U^{-1}$. Show that $u = a(u)m(u)a(u^{-1})^{-1}$ is reduced, where
$m(u) \neq 1$, and if $w = u_1 \cdots u_n$, then in the reduced form of $w$, $m(u_1), \ldots, m(u_n)$
are uncancelled.

13. Show that a reduced subset $X$ of a free group is a basis of $\mathrm{gp}\{X\}$. Deduce that
every finitely generated subgroup of a free group is free.

14. Let $F$ be a free group of finite rank on a basis $X$ and let $\lambda$ be an automorphism of
$F$. By applying Exercises 11 and 12 to the set $X^{\lambda}$ show that every automorphism
of $F$ can be obtained by a permutation and a Nielsen transformation of the
generating set.

15. (M. Takahashi) Let $F$ be a free group and $F = F_1 \supseteq F_2 \supseteq \cdots$ a series of
subgroups such that $F_{i+1}$ contains no element of a basis of $F_i$. Show that relative
to any basis of $F_1$ any element $w \neq 1$ of $F_{i+1}$ satisfies $|w| > i$. Deduce that
any infinite series of characteristic subgroups of $F$ intersects in 1, and so
obtain another proof of Magnus's theorem (Corollary 3.4.6).


Algebras 


In Section 5.2 of BA we saw that semisimple Artinian rings can be described quite 
explicitly as direct products of full matrix rings over skew fields (Wedderburn’s 
theorem). Later in Chapters 7 and 8 we shall see what can be said when the Artinian 
hypothesis is dropped, but in many cases, such as the study of group algebras in 
finite characteristic, it is important to find out more about the non-semisimple 
(but Artinian) case. There is now a substantial theory of such algebras which is still 
developing, and a full treatment is beyond the framework of this book, but some of 
the basic properties are described here. 

One of the main results, the Krull-Schmidt theorem, asserts the uniqueness of 
decompositions of a module as a direct sum of indecomposables. This is established 
in Section 4.1 for finitely generated modules over Artinian rings, but as we shall see 
in Section 4.3, for projective modules it holds over the somewhat larger class of semi- 
perfect rings. These rings are also of interest because they allow the construction of a 
projective cover for each finitely generated module (Section 4.2). 

The rest of the chapter is concerned with conditions for two rings to have equiva- 
lent module categories (Section 4.4), leading in Section 4.5 to Morita equivalence; 
Section 4.6 deals with flat modules and their relation to projective and injective 
modules, while Section 4.7 studies the homology of algebras and in particular separ- 
able algebras. 


4.1 The Krull-Schmidt theorem 


A finitely generated module over a general Artinian ring may not be semisimple, i.e.
a direct sum of simple modules, but it can always be written as a direct sum of
indecomposable modules. Moreover, these indecomposable summands do not depend
on the decomposition chosen, but are unique up to isomorphism. This is the content
of the Krull-Schmidt theorem; our aim in this section is to prove this result, but we
shall do so in a slightly more general form.

We recall that a local ring is a ring $R$ in which the set of all non-units forms an ideal $\mathfrak{m}$;
this is then the unique maximal ideal and it is easily seen that $R/\mathfrak{m}$ is a skew field,
called the residue class field of $R$. From the definition it is clear that a ring is local
precisely if the sum of any two non-units is a non-unit or equivalently, for any
non-unit $c$, $1 - c$ is a unit. The maximal ideal of a local ring is its Jacobson radical;
since the latter is nilpotent in any Artinian ring (BA, Theorem 5.3.5), it follows that
an Artinian ring is local iff every element is either nilpotent or a unit. Such rings arise 
naturally as endomorphism rings of indecomposable modules, as we shall now see. 
In what follows we shall write our modules as left modules and put module homo- 
morphisms on the right, except when otherwise stated. 


Lemma 4.1.1 (Fitting's lemma). Let $R$ be any ring and $M$ an $R$-module of finite
composition length. Given any endomorphism $\alpha$ of $M$, there exists a direct decomposition

$$M = M_0 \oplus M_1 \tag{4.1.1}$$

such that $M_0$, $M_1$ both admit $\alpha$, and $\alpha$ restricted to $M_0$ is nilpotent, while its restriction
to $M_1$ is an automorphism.


Proof. We have $M \supseteq M\alpha \supseteq M\alpha^2 \supseteq \cdots$, $0 \subseteq \ker\alpha \subseteq \ker\alpha^2 \subseteq \cdots$; since $M$ has
finite length, there exists $n$ (at most equal to the length of $M$) such that
$M\alpha^n = M\alpha^{n+1} = \cdots$, $\ker\alpha^n = \ker\alpha^{n+1} = \cdots$. We put $M_1 = M\alpha^n$, $M_0 = \ker\alpha^n$
and then have $M_1\alpha^n = M\alpha^{2n} = M\alpha^n = M_1$. Thus for any $x \in M$, $x\alpha^n = x_1\alpha^n$ for
some $x_1 \in M_1$; hence $x - x_1 \in \ker\alpha^n = M_0$ and so $x \in M_0 + M_1$. This proves that
$M = M_0 + M_1$, and this sum is direct, for if $y \in M_0 \cap M_1$, then $y = x\alpha^n$, hence
$0 = y\alpha^n = x\alpha^{2n}$. Thus $x \in \ker\alpha^{2n} = \ker\alpha^n$ and $y = x\alpha^n = 0$. This shows the sum
to be direct and it establishes the decomposition (4.1.1); clearly $\alpha$ is nilpotent on
$M_0$ and bijective on $M_1$.
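For a concrete instance of Fitting's lemma one may take $M$ to be a finite-dimensional vector space and $\alpha$ a linear map, so that "length" is just dimension. The following sketch is an illustration under assumptions of our own: the field is $\mathbf{F}_3$, vectors are rows acted on from the right (matching the convention $x\alpha$ used above), and the helper names `mat_mul`, `rank` and `fitting` are hypothetical. It finds the least $n$ at which the chain of images becomes stationary and reports $\dim M_0 = \dim\ker\alpha^n$, $\dim M_1 = \dim M\alpha^n$, together with a check that the sum is direct.

```python
P = 3  # assumption: the ground field is the prime field F_3

def mat_mul(A, B):
    """Product of square matrices over F_P."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) % P for j in range(n)]
            for i in range(n)]

def rank(A):
    """Rank over F_P by Gauss-Jordan elimination on a copy of A."""
    M = [row[:] for row in A]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] % P), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        inv = pow(M[r][c], -1, P)
        M[r] = [(x * inv) % P for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c] % P:
                f = M[i][c]
                M[i] = [(x - f * y) % P for x, y in zip(M[i], M[r])]
        r += 1
    return r

def fitting(A):
    """Return (n, dim M_0, dim M_1, direct) for the endomorphism x -> xA.

    n is the least exponent with rank(A^n) == rank(A^(n+1)); the images are
    nested, so equal rank means equal image.  dim M_0 follows from the
    rank-nullity theorem, and the sum is direct iff rank(A^(2n)) == rank(A^n).
    """
    dim = len(A)
    power = [[int(i == j) for j in range(dim)] for i in range(dim)]  # A^0
    n = 0
    while True:
        nxt = mat_mul(power, A)
        if rank(nxt) == rank(power):
            break
        power, n = nxt, n + 1
    dim_m1 = rank(power)
    direct = rank(mat_mul(power, power)) == dim_m1
    return n, dim - dim_m1, dim_m1, direct

# A nilpotent 2x2 block together with an invertible 2x2 block.
A = [[0, 1, 0, 0],
     [0, 0, 0, 0],
     [0, 0, 1, 1],
     [0, 0, 0, 2]]
print(fitting(A))
```

For this matrix the chain of images stabilises at $n = 2$, and $M_0$, $M_1$ are both two-dimensional, corresponding to the nilpotent and the invertible block respectively.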


If in this lemma $M$ is indecomposable, then $M_0$ or $M_1$ is the zero module and we
obtain


Corollary 4.1.2. Let $M$ be an indecomposable $R$-module of finite length. Then any
endomorphism of $M$ is either nilpotent or an automorphism, thus $\mathrm{End}_R(M)$ is a local
ring.


A local ring in which every non-unit is nilpotent is said to be completely primary. 
Thus the endomorphism ring of an indecomposable module of finite length is com- 
pletely primary. This result does not hold without some restriction on the module M 
beyond being indecomposable (see Exercise 5), but we remark that conversely, any 
module with local endomorphism ring is indecomposable. For if $M$ is decomposable,
say $M = M_1 \oplus M_2$, then the projection $e_1 : M \to M_1$ is an idempotent of $\mathrm{End}(M)$
which for a non-trivial decomposition is neither 0 nor 1. Hence $e_1$ and $1 - e_1$ are both
non-units and so $\mathrm{End}(M)$ cannot be local. Sometimes a module with local endomorphism
ring is called ‘strongly indecomposable’. 

In what follows we shall state our conclusions for modules with local endo- 
morphism rings. By Corollary 4.1.2 they will apply to any indecomposable modules 
of finite length, in particular to any finitely generated indecomposable modules over 
Artinian rings, but they will also apply in some other cases. 


Lemma 4.1.3. Let $R$ be any ring and $V$ an indecomposable $R$-module such that
$\mathrm{End}_R(V) = E$ is a local ring with maximal ideal $\mathfrak{m}$. Given any $R$-module $M$ and
homomorphisms $\alpha : V \to M$, $\beta : M \to V$, we have $\alpha\beta \in \mathfrak{m}$ unless $V$ is a direct summand
of $M$.


Proof. If $\alpha\beta \notin \mathfrak{m}$, it is a unit in $E$, so there exists $\gamma \in E$ such that $\alpha\beta\gamma = \gamma\alpha\beta = 1$.
This shows that $\alpha$ is injective and we have an exact sequence


$$0 \to V \xrightarrow{\alpha} M \to \operatorname{coker}\alpha \to 0.$$


The sequence is split by $\beta\gamma$, hence $M \cong V \oplus \operatorname{coker}\alpha$.


In what follows we shall fix a left $R$-module $V$ with local endomorphism ring
$E = \mathrm{End}_R(V)$. The maximal ideal of $E$ will be denoted by $\mathfrak{m}$ and we put $K = E/\mathfrak{m}$
and write $x \mapsto [x]$ for the natural homomorphism $E \to K$. Given any left
$R$-module $M$, we can consider $\mathrm{Hom}_R(V, M)$ as left $E$-module in a natural way. We
shall write $[V, M] = \mathrm{Hom}_R(V, M)/\mathfrak{m}\,\mathrm{Hom}_R(V, M)$; this is a left $E$-module which is
annihilated by $\mathfrak{m}$, so it can be defined as a left vector space over $K$ in a natural way.
Similarly we can consider $\mathrm{Hom}_R(M, V)$ as a right $E$-module and hence define
$[M, V] = \mathrm{Hom}_R(M, V)/\mathrm{Hom}_R(M, V)\mathfrak{m}$ as a right $K$-space. Next we define a
bilinear mapping


$$b : [V, M] \times [M, V] \to K$$


by the following rule: Given $\alpha \in [V, M]$, $\beta \in [M, V]$, take homomorphisms $f$, $g$
such that $[f] = \alpha$, $[g] = \beta$ and define


$$b(\alpha, \beta) = [fg]. \tag{4.1.2}$$


This is a well-defined operation on $\alpha$, $\beta$, for if $[f] = [f']$, say $f' = f + \sum \lambda_i h_i$,
where $\lambda_i \in \mathfrak{m}$, $h_i \in \mathrm{Hom}(V, M)$, then $[f'g] = [fg] + \sum \lambda_i[h_i g] = [fg]$, hence $[fg]$
depends only on $[f]$, not on $f$, and similarly for $g$. In this way (4.1.2) defines a
pairing of the spaces $[V, M]$, $[M, V]$. We define the rank of $b$ in the usual way as the
rank of the matrix obtained by taking bases. Thus if $(u_i)$ is a left $K$-basis of
$[V, M]$ and $(v_j)$ a right $K$-basis of $[M, V]$, then the rank of $b$ is given by the rank
of the matrix $(b(u_i, v_j))$. Clearly this is independent of the choice of bases; we
shall denote it by $\mu_V(M)$, thus


$$\mu_V(M) = \mathrm{rk}(b(u_i, v_j)). \tag{4.1.3}$$


Suppose now that $M$ is expressed as a direct sum: $M = \bigoplus_{i=1}^{s} M_i$. It is clear that this
gives rise to a direct sum decomposition for both $[V, M]$ and $[M, V]$:


$$[V, M] = \bigoplus [V, M_i], \qquad [M, V] = \bigoplus [M_i, V].$$


Here a homomorphism $f : V \to M_i$ corresponds to a homomorphism $V \to M$
which is obtained by combining $f$ with the canonical injection $M_i \to M$. Similarly
a homomorphism $g : M_i \to V$ corresponds to a homomorphism $M \to V$ obtained
by following the canonical projection $M \to M_i$ by $g$. It follows that the elements of
$[V, M].[M, V]$ are $s \times s$ matrices with the members of $[V, M_i][M_i, V]$ along the
main diagonal and zeros elsewhere. By Lemma 4.1.3, $[V, M_i][M_i, V]$ can be
non-zero only if $M_i$ has $V$ as a direct summand; in particular it will be zero whenever
$M_i$ is indecomposable and not isomorphic to $V$. All these arguments still apply when
$M$ is an infinite direct sum, and they allow us to draw the following conclusion:


Proposition 4.1.4. Let $R$ be a ring and $V$ an $R$-module with local endomorphism ring.
Given an $R$-module $M$ which is expressed as a direct sum of indecomposable modules,
the multiplicity of $V$ in this direct sum is independent of the decomposition chosen for $M$
and is equal to $\mu_V(M)$.


Proof. Let the given decomposition of $M$ be

$$M = \bigoplus_{i \in I} M_i, \tag{4.1.4}$$

and write $\mu_V(M) = \mu(M)$ for short. Then $[V, M][M, V]$ is a direct sum of terms
$[V, M_i][M_i, V]$, by the above remarks, and since $M_i$ is indecomposable, we have


$$[V, M_i][M_i, V] = \begin{cases} K & \text{if } M_i \cong V, \\ 0 & \text{otherwise.} \end{cases}$$


It follows that the rank $\mu(M)$ is equal to the number of terms in (4.1.4) that are
isomorphic to $V$, i.e. the multiplicity of $V$ in (4.1.4).


Corollary 4.1.5. Given any ring $R$ and $R$-module $M$, let

$$M = \bigoplus_{I} M_\lambda = \bigoplus_{J} N_\mu$$

be two direct decompositions of $M$ into indecomposable modules. If the components in at
least one of these decompositions have local endomorphism rings, then there is a bijection
$\lambda \mapsto \lambda'$ from $I$ to $J$ such that $M_\lambda \cong N_{\lambda'}$.


Proof. Suppose that $\mathrm{End}_R(M_\lambda)$ is local for all $\lambda$. By Proposition 4.1.4, the
multiplicity of each $M_\lambda$ is the same in both decompositions and this allows us to construct
the desired bijection.


For Artinian rings this conclusion may be stated as follows: 


Theorem 4.1.6 (Krull-Schmidt theorem). Let $R$ be any Artinian ring. Any finitely
generated $R$-module $M$ has a finite direct decomposition


$$M = M_1 \oplus \cdots \oplus M_r, \tag{4.1.5}$$


where the terms $M_i$ are indecomposable, and this decomposition is unique up to
isomorphism and the order of the terms; thus if also $M = N_1 \oplus \cdots \oplus N_s$, where each
$N_j$ is indecomposable, then $s = r$ and there is a permutation $i \mapsto i'$ of $1, \ldots, r$ such
that $M_i \cong N_{i'}$.


Proof. Any finitely generated $R$-module over an Artinian ring $R$ is Artinian (BA,
Theorem 4.2.3) and so has finite length (BA, Theorem 5.3.9). Hence we can form
a direct decomposition (4.1.5) with a maximum number of terms, and all terms
are then indecomposable. By Corollary 4.1.2 each endomorphism ring $\mathrm{End}_R(M_i)$ is
local, so we can apply Corollary 4.1.5 to conclude that the decompositions have
isomorphic terms, possibly after reordering.




Theorem 4.1.6 was stated for finite groups by Joseph H. M. Wedderburn in 1909, 
and first completely proved by Robert Remak in 1911. It was later extended to 
abelian groups with operators by Wolfgang Krull in 1928 and to general groups 
with operators by Otto Yu. Schmidt in 1928. In 1950 Goro Azumaya noted that the
finiteness condition could be replaced by the condition that the endomorphism 
rings of the components be local (Corollary 4.1.5). The above presentation is 
based on a lecture by Sandy Green. 


Exercises 


1. Show that a ring R with Jacobson radical J is local iff R/J is a skew field. 

2. Show that in any ring an idempotent with 1-sided inverse equals 1. Deduce that a
ring $R$ in which for each $a \in R$ either $a$ or $1 - a$ has a 1-sided inverse, must be
local. (Here '$a$ has a 1-sided inverse' is taken to mean: either $ax = 1$ or $xa = 1$
has a solution.)

3. Give an example of a non-Artinian non-local ring in which every element is either 
nilpotent or a unit. (Hint. Try the commutative case.) 

4. Show that a ring $R$ is indecomposable as module over itself iff $R$ contains no
idempotent $\neq 0, 1$.

5. Use Exercise 4 to give an example of an indecomposable module whose endo- 
morphism ring is not local. 

6. Let $V$ be an indecomposable module which is not injective and let $I$ be its injective
hull. Show that $[V, I] \neq 0$ but $[V, I].[I, V] = 0$.

7. Let $M = M_1 \oplus \cdots \oplus M_r$ and suppose that $V$ is an indecomposable module which
is a direct summand of $M$. Show that $V$ is a direct summand of $M_i$ for some
an SI hence 


4.2 The projective cover of a module 


We have seen in Section 2.3 that every module has an injective hull. Dually one can 
define the projective cover of a module, but this does not exist for all modules. Later, 
in Section 4.3, we shall meet general conditions for its existence, but for the moment 
we shall show that its existence is assured over Artinian rings. Throughout this 
section we shall limit ourselves to finitely generated modules; we recall that over 
an Artinian ring every finitely generated module has finite composition length 
(BA, Theorem 4.2.3 and Theorem 5.3.9). We begin with a couple of auxiliary 
remarks which hold for general rings. Here a maximal submodule is understood 
to be among all the proper submodules. 


Lemma 4.2.1. Let $R$ be a ring with Jacobson radical $J$ and let $M$ be a finitely generated
$R$-module. Then


$$JM \subseteq \bigcap M_\lambda, \tag{4.2.1}$$


where $M_\lambda$ ranges over all maximal submodules of $M$. Moreover, if $R/J$ is semisimple,
then equality holds in (4.2.1) and we have


$$M/JM \cong \bigoplus S_\mu, \tag{4.2.2}$$

where the $S_\mu$ are simple modules, quotients of $M$.


Proof. Let $M_\lambda$ be a maximal submodule of $M$. If $JM \not\subseteq M_\lambda$, then $M_\lambda + JM = M$,
hence by Nakayama's lemma (BA, Corollary 5.3.7), $M_\lambda = M$, which is a
contradiction. Thus $JM \subseteq M_\lambda$ and (4.2.1) follows. Assume now that $\bar{R} = R/J$ is semisimple;
then $\bar{M} = M/JM$ may be regarded as an $\bar{R}$-module and hence is semisimple, so we
obtain (4.2.2), for a family $S_\mu$ of simple modules. Combining the isomorphism
(4.2.2) with the projection on $S_\mu$ we obtain a homomorphism $f_\mu : \bar{M} \to S_\mu$ whose
kernel is a maximal submodule of $\bar{M}$ and so is of the form $N_\mu/JM$, where $N_\mu$ is a
maximal submodule of $M$. Hence $\bigcap N_\mu/JM = 0$, i.e. $\bigcap N_\mu = JM$, and since the $N_\mu$
form a subfamily of the $M_\lambda$, we now have equality in (4.2.1).
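A small finite example may help to make Lemma 4.2.1 concrete. The sketch below is illustrative only and rests on assumptions of our own choosing: the ring is $R = \mathbf{Z}/12$, an Artinian ring, and the module is $M = R$ itself. It computes the Jacobson radical from the characterization "$r \in J$ iff $1 - rx$ is a unit for all $x$", lists the submodules $dM$ for the divisors $d$ of 12, and compares $JM$ with the intersection of the maximal submodules.

```python
from math import gcd

n = 12          # assumption: R = Z/12, regarded as a module M over itself
R = range(n)
M = frozenset(R)

def is_unit(x):
    return gcd(x % n, n) == 1

# Jacobson radical: all r such that 1 - r*x is a unit for every x in R.
J = {r for r in R if all(is_unit(1 - r * x) for x in R)}

# Every submodule of the cyclic module Z/n has the form d*(Z/n) with d | n.
submodules = {frozenset(d * x % n for x in R)
              for d in range(1, n + 1) if n % d == 0}
proper = [S for S in submodules if S != M]
maximal = [S for S in proper if not any(S < T for T in proper)]

intersection = frozenset.intersection(*maximal)
# Since M = R contains 1, the set of products r*m is already the ideal J.
JM = frozenset(r * m % n for r in J for m in M)

print(sorted(J), sorted(intersection), JM == intersection)
print(len(M) // len(JM))  # order of the top M/JM
```

Here the maximal submodules are $2M$ and $3M$, both $JM$ and their intersection equal $\{0, 6\}$, and $M/JM$ has order 6, in agreement with the decomposition (4.2.2) into two simple modules of orders 2 and 3.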


A module homomorphism $f : M \to N$ will be called essential if it is surjective but
its restriction to any proper submodule of $M$ fails to be surjective. By a projective
cover of a module $M$ we shall understand a projective module $P$ with an essential
homomorphism $P \to M$. The following lemma is useful for testing for essentiality.


Lemma 4.2.2. Let $R$ be any ring. Given an $R$-module $M$, a finitely generated projective
$R$-module $P$ and a surjective homomorphism $\alpha : P \to M$, if $\ker\alpha \subseteq JP$, then $\alpha$ is
essential. If $R/J$ is semisimple, this sufficient condition is also necessary.


Proof. Let $P'$ be any submodule of $P$ such that $P'\alpha = M$. Then for any $x \in P$
there exists $x' \in P'$ such that $x\alpha = x'\alpha$, i.e. $x \in P' + \ker\alpha$. Thus $P' + \ker\alpha = P$,
hence $P' + JP = P$, and by Nakayama's lemma, $P' = P$; this shows $\alpha$ to be essential.
Suppose now that $R/J$ is semisimple and $\alpha$ is essential. Let $P_1$ be any maximal
submodule of $P$; if $\ker\alpha \not\subseteq P_1$, then $P_1 + \ker\alpha = P$, hence $P_1\alpha = P\alpha = M$,
contradicting the fact that $\alpha$ is essential. Therefore $\ker\alpha \subseteq P_1$ and now
$\ker\alpha \subseteq \bigcap P_\lambda = JP$, by Lemma 4.2.1.


Our first task is to prove the existence of projective covers in the Artinian case. 


Proposition 4.2.3. Let R be a left Artinian ring. Then any finitely generated left 
R-module has a projective cover. 


Proof. Let $M$ be a finitely generated $R$-module; we can find a finitely
generated projective module $P$ mapping onto $M$. We choose $P$ of shortest length
and claim that in this case the homomorphism $\pi : P \to M$ is essential. For take a
minimal submodule $N$ of $P$ such that $\pi|N$ is surjective and let $i : N \to P$ be the
inclusion map and put $f = i\pi : N \to M$. Since $P$ is projective, there is a map
$g : P \to N$ such that $gf = \pi$; now $\pi$ maps $N$ onto $M$, so $f$ maps $\mathrm{im}(g|N)$ onto $M$
and by the minimality of $N$ it follows that $\mathrm{im}(g|N) = N$. Thus $g|N$ is a surjective
endomorphism of the module $N$ of finite length, therefore it is an automorphism
and so $N \cap \ker g = 0$. Given $x \in P$, we have $xg \in N$ and since $g|N$ is an
automorphism, we have $xg = yg^2$ for some $y \in P$. Now $x - yg \in \ker g$ and
$x = yg + (x - yg)$; this shows that $P = N \oplus \ker g$. Therefore $N$ is projective, but
this contradicts the minimality of $P$, unless $N = P$.


When a projective cover exists, it must be unique; this can be proved quite 
generally. 


Proposition 4.2.4. If $P$, $Q$ are projective covers of a module $M$ (over any ring $R$), with
essential homomorphisms $\alpha : P \to M$, $\beta : Q \to M$, then there is an isomorphism
$\theta : P \to Q$ such that $\alpha = \theta\beta$.


Proof. Since $P$ is a projective module and $\beta$ is surjective, there exists $\theta : P \to Q$ such
that $\alpha = \theta\beta$. This map $\theta$ must be surjective, because $\beta$ is essential, and so $P$ splits over
$\ker\theta$, say $P = Q_1 \oplus \ker\theta$, where $Q_1 \cong Q$. But then $Q_1\alpha = Q_1\theta\beta = M$, hence $Q_1 = P$
and so $\ker\theta = 0$. Thus $\theta$ is an isomorphism, as claimed.


We shall denote the projective cover of a module M by P(M), bearing in mind 
that this operation P is not defined for all modules. There is a second operation 
closely related to P which is defined for all modules. For any module M we define 
its top as 


T(M) = M/JM, (4.2.3) 


where J = J(R) is the Jacobson radical. If M is finitely generated and T(M ) = 0, then 
M = 0; this is just the content of Nakayama’s lemma. When M is finitely generated, 
then the natural homomorphism $\tau : M \to T(M)$ is essential, for it is clearly
surjective, and if $N \subset M$ is a proper submodule, then $N$ is contained in a maximal
submodule $N_0$ of $M$. We have $N_0 \supseteq JM$, hence the natural map $\nu : M \to M/N_0$ can be
factored as $M \to T(M) \to M/N_0$, where $\nu|N_0 = 0$; if $\tau|N$ were surjective, then so would
be $\tau|N_0$ and hence $\nu|N_0$, which is a contradiction. This
shows $\tau$ to be essential. When $R$ is Artinian, or more generally, when $R/J$ is
semisimple, then $T(M)$ is semisimple by Lemma 4.2.1.

We note that the projective cover, when it exists, may be obtained as the projective 
cover of its top: 


Proposition 4.2.5. For any ring $R$ and any $R$-module $M$ with a projective cover we
have $P(M) \cong P(T(M))$.


Proof. The composition of two essential maps is clearly essential and so we have the
essential map


$$P(M) \to M \to T(M), \tag{4.2.4}$$


so the result follows.


Similarly, when forming the top, under the right conditions it does not matter 
whether we start from a given module or its projective cover. 


Proposition 4.2.6. Let $R$ be a ring such that $R/J$ is semisimple, and let $M$ be a finitely
generated $R$-module with a projective cover. Then


$$T(P(M)) \cong T(M). \tag{4.2.5}$$


Proof. Write $P = P(M)$ and let $N$ be the kernel of the essential map $\alpha : P \to M$. By
Lemma 4.2.2, $N \subseteq JP$ and since $\alpha$ is surjective, it maps $JP$ onto $JM$. Therefore
$T(P) = P/JP \cong (P/N)/(JP/N) \cong M/JM = T(M)$.


In particular this shows that the projective cover of a simple module has a simple 
top. 


Exercises 


1. Verify that T is a functor. Under what circumstances is P a functor? 

2. Show that the only finitely generated $\mathbf{Z}$-modules with a projective cover are the
free modules.

3. Show that $T(T(M)) = T(M)$, $P(P(M)) = P(M)$.

4. Show that if $P(M)$ is indecomposable, then so is $M$. Does the converse hold?

5. Show that if $\alpha : P \to M$ is a projective cover and $\beta : Q \to M$ is surjective, where
$Q$ is a projective module, then $Q = P_0 \oplus P_1$, where $P_1 \cong P$ and $\beta|P_1$ corresponds
to $\alpha$, while $\beta|P_0 = 0$. Use the result to give another proof of Proposition 4.2.4.


4.3 Semiperfect rings 


We have seen that Artinian rings have many properties not shared by general rings, 
but there are certain classes of rings for which at least some of these properties hold. 
One such class is formed by the semiperfect rings, introduced by Hyman Bass in 
1960. We begin by examining the role of idempotents in Artinian rings. 

Any Artinian ring has finite composition length as left module over itself and so 
can be decomposed into a direct sum of indecomposable left modules. Such decom- 
positions can be described by the corresponding decompositions of 1 as a sum of 
idempotents. Let us recall that two idempotents e, f of a ring R are called orthogonal 
if ef = fe = 0; an idempotent e is primitive if $e \neq 0$ and e cannot be written as a sum
of two non-zero orthogonal idempotents. In general rings idempotents can be used 
to describe direct decompositions, as our first result shows. 


Proposition 4.3.1. Let R be any ring. Any decomposition of R as a direct sum of a finite
number of left ideals

    $R = \mathfrak{a}_1 \oplus \cdots \oplus \mathfrak{a}_n$    (4.3.1)

corresponds to a decomposition of 1 as a sum of pairwise orthogonal idempotents

    $1 = e_1 + \cdots + e_n, \quad e_i^2 = e_i, \quad e_ie_j = 0$ for $i \neq j$,    (4.3.2)

where $\mathfrak{a}_i = Re_i$, and for any idempotent e, Re is indecomposable if and only if e is
primitive.


Proof. Given (4.3.1), we write 1 as $\sum e_i$, where $e_i \in \mathfrak{a}_i$. Then $e_k = \sum_i e_ke_i$; since the
sum (4.3.1) is direct, we see that $e_ke_i = 0$ for $i \neq k$ and $e_k^2 = e_k$, so (4.3.2) follows.
Conversely, given (4.3.2), put $\mathfrak{a}_i = Re_i$; then any $x \in R$ can be written as
$x = \sum xe_i$, where $xe_i \in \mathfrak{a}_i$, hence $R = \sum \mathfrak{a}_i$ and this sum is direct, for if
$\sum x_ie_i = 0$, then right multiplication by $e_k$ gives $x_ke_k = 0$.

For any idempotent e, Re is a direct summand of R: $R = Re \oplus R(1 - e)$; now the
above correspondence shows that any direct decomposition of Re corresponds to
writing e as a sum of two orthogonal idempotents. It follows that Re is indecomposable
iff e is primitive. ■
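The correspondence of Proposition 4.3.1 can be checked by hand in a small commutative case. The following is a minimal sketch, assuming the ring Z/6 and the idempotents 3 and 4 (our choice of example, not taken from the text); the helper names `idempotents` and `left_ideal` are hypothetical.

```python
def idempotents(n):
    """Return all idempotents of the ring Z/n, i.e. residues e with e*e = e (mod n)."""
    return [e for e in range(n) if (e * e) % n == e]


def left_ideal(e, n):
    """Return the ideal (Z/n)e as a sorted list of residues."""
    return sorted({(x * e) % n for x in range(n)})


n = 6
e1, e2 = 3, 4

# e1 and e2 are orthogonal idempotents with sum 1 in Z/6, as in (4.3.2).
assert e1 in idempotents(n) and e2 in idempotents(n)
assert (e1 * e2) % n == 0 and (e1 + e2) % n == 1

a1, a2 = left_ideal(e1, n), left_ideal(e2, n)

# Every x decomposes as x*e1 + x*e2, and the two ideals meet only in 0,
# so Z/6 is the direct sum of a1 and a2, as in (4.3.1).
assert all((x * e1 + x * e2) % n == x for x in range(n))
assert set(a1) & set(a2) == {0}
```

Here both idempotents are primitive, since neither ideal contains a further non-trivial idempotent summand, so this decomposition of Z/6 is complete.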


In the Artinian case one can construct complete decompositions (4.3.1) (i.e. into 
indecomposable terms) by writing the semisimple ring R/J as a direct sum of 
simple left ideals and then ‘lifting’ this decomposition to R. The essential step is 
to lift an idempotent from R/J to R; below we shall see how to do this for Artinian 
rings and then go on to describe a somewhat larger class of rings for which this is 
possible. 

Let R be any ring and $\mathfrak{N}$ an ideal of R; an element $u \in R$ such that $u^2 \equiv u \pmod{\mathfrak{N}}$
is called an idempotent mod $\mathfrak{N}$, and we say that u can be lifted to R if there exists
$e \in R$ such that $e^2 = e$ and $e \equiv u \pmod{\mathfrak{N}}$. For such a lifting to be possible one
usually has to assume that $\mathfrak{N} \subseteq J(R)$, but this by itself is not enough. For example,
in the ring R of rational numbers with denominators prime to 6, J(R) = 6R and 3, 4
are idempotents mod 6R, which however cannot be lifted to R. The next result gives a
sufficient condition. We recall that a nil ideal is an ideal consisting of nilpotent
elements.


Lemma 4.3.2. Let R be a ring and $\mathfrak{N}$ a nil ideal in R. Then idempotents mod $\mathfrak{N}$ can be
lifted to R.


Proof. Let u be an idempotent mod $\mathfrak{N}$; then $(u - u^2)^m = 0$ for some $m \geq 1$. We have

    $1 = [u + (1 - u)]^{2m-1} = \sum_{i=0}^{2m-1} \binom{2m-1}{i} u^{2m-1-i}(1 - u)^i.$

On the right the first m terms are divisible by $u^m$, while each term after the first m is
divisible by $(1 - u)^m$, so on denoting the sum of the first m terms by e, we can write
$1 = e + (1 - u)^m g$, where g is a polynomial in u. Now $u(1 - u) \in \mathfrak{N}$, so

    $e = u^{2m-1} + (2m-1)u^{2m-2}(1 - u) + \cdots + \binom{2m-1}{m-1} u^m(1 - u)^{m-1} \equiv u^{2m-1} \equiv u \pmod{\mathfrak{N}},$

and $e(1 - e) = e(1 - u)^m g = 0$, hence e is an idempotent. ■
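The formula for e in the proof of Lemma 4.3.2 is explicit enough to be computed. The sketch below assumes the ring Z/36 with nil ideal 6Z/36 (our choice of example, not from the text) and evaluates the sum of the first m terms of the binomial expansion; the function name `lift_idempotent` is hypothetical.

```python
from math import comb


def lift_idempotent(u, n, m):
    """Lift u to an idempotent of Z/n, following the binomial formula in the proof.

    Assumes (u - u^2)^m = 0 in Z/n, i.e. u is idempotent modulo a nil ideal.
    """
    if pow(u - u * u, m, n) != 0:
        raise ValueError("(u - u^2)^m is not zero in Z/n")
    # e is the sum of the first m terms of the expansion of [u + (1 - u)]^(2m-1).
    e = sum(comb(2 * m - 1, i) * u ** (2 * m - 1 - i) * (1 - u) ** i
            for i in range(m))
    return e % n


# 3 and 4 are idempotent modulo 6, and (u - u^2)^2 vanishes in Z/36 for both.
e3 = lift_idempotent(3, 36, 2)
e4 = lift_idempotent(4, 36, 2)
assert (e3 * e3) % 36 == e3 and e3 % 6 == 3
assert (e4 * e4) % 36 == e4 and e4 % 6 == 4
```

In contrast with the ring of rationals with denominators prime to 6 mentioned above, here the ideal generated by 6 is nilpotent, which is why the lifting succeeds.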


In an Artinian ring R the Jacobson radical J(R) is nilpotent, so in this case any
idempotent mod J(R) can be lifted to R, by Lemma 4.3.2, but there are other cases where
this is possible and it is convenient to make a definition at this point, introduced by
Hyman Bass [1960]:


Definition. A ring R is said to be semiperfect if idempotents (mod J(R)) can be lifted 
to R and R/J(R) is semisimple. 

It is clear from the definition that this notion is left-right symmetric. Moreover, 
since for any ring R, R/J(R) has zero radical, the latter is semisimple iff it is right 
(or left) Artinian, by BA, Theorem 5.3.5. In particular, this establishes 


Theorem 4.3.3. Every left or right Artinian ring is semiperfect. ■


Of course the converse does not hold; the class of semiperfect rings is much wider 
than the class of Artinian rings (see Exercise 11). To describe semiperfect rings we 
shall need to know when two idempotents generate isomorphic ideals. Let us call 
two idempotents e, f in a ring R conjugate if R contains a unit u such that
$f = u^{-1}eu$. If there exist $a \in eRf$, $b \in fRe$ such that ab = e, ba = f, we shall call e
and f isomorphic. For example, conjugate idempotents are isomorphic, for if
$f = u^{-1}eu$, we can take $a = eu$, $b = u^{-1}e$ and then find that $a = euf$, $b = fu^{-1}e$
and ab = e, ba = f. The next lemma clarifies the relation between these two
concepts. 
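As an illustration of these two definitions, the following sketch takes the ring of 2 x 2 integer matrices with the matrix units E11 and E22 as e and f (our choice of example, not from the text); the helper `matmul` is hypothetical.

```python
def matmul(x, y):
    """Multiply two square matrices given as tuples of row tuples."""
    n = len(x)
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n))
        for i in range(n)
    )


e = ((1, 0), (0, 0))  # E11
f = ((0, 0), (0, 1))  # E22
a = ((0, 1), (0, 0))  # E12, an element of eRf
b = ((0, 0), (1, 0))  # E21, an element of fRe

# e and f are isomorphic: a = eaf, b = fbe, ab = e, ba = f.
assert matmul(matmul(e, a), f) == a
assert matmul(matmul(f, b), e) == b
assert matmul(a, b) == e and matmul(b, a) == f

# They are also conjugate, by the permutation matrix u, which is its own inverse.
u = ((0, 1), (1, 0))
assert matmul(matmul(u, e), u) == f
# The elements a = eu and b = u^{-1}e of the text are exactly E12 and E21.
assert matmul(e, u) == a and matmul(u, e) == b
```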


Lemma 4.3.4. Let e, f be any idempotents in a ring R. Then 


(i) e, f are isomorphic if and only if $eR \cong fR$ or equivalently, $Re \cong Rf$,
(ii) e, f are conjugate if and only if e is isomorphic to f and 1 - e is isomorphic to 1 - f.


Proof. (i) Assume that $Re \cong Rf$, say $\theta: Re \to Rf$ is an isomorphism and suppose
that $\theta$ maps e to a, while $\theta^{-1}$ maps f to b. Since $a \in Rf$, we have af = a and
$e(e\theta) = e\theta$, i.e. ea = a. Hence eaf = af = a and similarly fbe = b. Further,
$e = e\theta\theta^{-1} = a\theta^{-1} = (af)\theta^{-1} = a \cdot f\theta^{-1} = ab$, and similarly f = ba, so e is
isomorphic to f. Conversely, if e, f are isomorphic, say ab = e, ba = f, where
$a \in eRf$, $b \in fRe$, then $x \mapsto xa$ is a homomorphism from Re to Rf with inverse
$y \mapsto yb$. Hence $Re \cong Rf$ iff e, f are isomorphic, and by symmetry this is equivalent
to $eR \cong fR$.

(ii) If e, f are conjugate, say $f = u^{-1}eu$, then as we saw, e and f are isomorphic.
Further $1 - f = u^{-1}(1 - e)u$, so 1 - e, 1 - f are also isomorphic. Conversely, if e,
f are isomorphic and 1 - e, 1 - f are isomorphic, say ab = e, ba = f, where
a = eaf, b = fbe, and $a'b' = 1 - e$, $b'a' = 1 - f$, $a' = (1 - e)a'(1 - f)$,
$b' = (1 - f)b'(1 - e)$, let us put $A = a + a'$, $B = b + b'$. Then $b'(1 - e) = b'$, so
$b'e = 0$ and similarly $ea' = 0$ and hence $BeA = bea + b'ea + bea' + b'ea' = bea = f$.
For the same reasons $B(1 - e)A = 1 - f$, hence BA = 1. By symmetry AfB = e,
$A(1 - f)B = 1 - e$ and so AB = 1. This shows e, f to be conjugate, as claimed. ■


This lemma together with the Krull-Schmidt theorem (Theorem 4.1.6) shows that 
in an Artinian ring isomorphic idempotents are conjugate, so in this case iso- 
morphism and conjugacy for idempotents mean the same thing. 




We shall usually want to lift orthogonal families of idempotents; this can be 
accomplished without further hypotheses on the ring. 


Proposition 4.3.5. Let R be any ring and e, f idempotents in R. Write J = J(R) and
denote the natural homomorphism $R \to R/J$ by $x \mapsto [x]$. Then

(i) e = 0 or 1 if and only if [e] = 0 or 1 respectively;

(ii) e is isomorphic to f if and only if [e] is isomorphic to [f]; in particular, if
[e] = [f], then e is conjugate to f;

(iii) if $ef \equiv fe \equiv 0 \pmod{J}$, then there exists an idempotent $f_1$ such that $f_1 \equiv f$
(mod J) and $ef_1 = f_1e = 0$.


Proof. (i) If $e \in J$, then 1 - e is a unit and since e(1 - e) = 0, we have e = 0.
Similarly if $1 - e \in J$ and the converse is clear.

(ii) Let $a, b \in R$ be such that $a \equiv eaf$, $b \equiv fbe$, $ab \equiv e$, $ba \equiv f \pmod{J}$ and put
$a_1 = eaf$, $b_1 = fbe$. Then $a_1b_1 = e - z$, where $z \in eJe$. Let $z'$ be the quasi-inverse of
z (i.e. $z + z' = zz' = z'z$) and put $z'' = ez'e$; then $z''$ is a quasi-inverse of z in eRe,
and it follows that $a_1b_1(e - z'') = e$. Putting $a_2 = a_1$, $b_2 = b_1(e - z'')$, we have
$a_2 \equiv a$, $b_2 \equiv b \pmod{J}$ and $a_2b_2 = e$. Next write $b_2a_2 = f - y$; then $y \in fJf$ and
since $(b_2a_2)^2 = b_2ea_2 = b_2a_2$, it follows that $f - y = (f - y)^2 = f^2 - fy - yf + y^2
= f - 2y + y^2$. Thus $y(y - 1) = 0$, and since $y \in J$, we find that y = 0 and so
$b_2a_2 = f$. Thus e is isomorphic to f; the rest is clear.

(iii) If $ef \equiv fe \equiv 0 \pmod{J}$, then 1 - fe is invertible. Put

    $f_0 = (1 - fe)^{-1}f(1 - fe);$

then $f_0$ is an idempotent conjugate to f. Moreover, $f_0 \equiv f \pmod{J}$ and clearly
$f_0e = 0$. Writing $f_1 = (1 - e)f_0$, we have $f_1 = f_0 - ef_0 \equiv f_0 - ef \equiv f_0 \equiv f \pmod{J}$
and $f_1e = 0 = ef_1$; moreover, $f_1^2 = (1 - e)f_0(1 - e)f_0 = (1 - e)f_0^2 = f_1$, so $f_1$ is the
required idempotent. ■


We shall use this result to lift decompositions of the form (4.3.2), again without 
further hypothesis: 


Proposition 4.3.6. In any ring R, let $e_1, \ldots, e_n$ be a set of idempotents such that
$e_ie_j \equiv 0 \pmod{J}$ for $i \neq j$. Then there exist idempotents $e_i'$ such that $e_i' \equiv e_i \pmod{J}$
and $e_i'e_j' = 0$ for $i \neq j$. If moreover,

    $1 \equiv e_1 + \cdots + e_n, \quad e_ie_j \equiv 0 \pmod{J}, \quad i \neq j,$    (4.3.3)

then there exist idempotents $e_i'$ such that $e_i' \equiv e_i \pmod{J}$, $1 = \sum e_i'$ and $e_i'e_j' = 0$ for
$i \neq j$.

Proof. For n = 1 there is nothing to prove, so we assume that n > 1 and use induction
on n. This means that we may assume $e_ie_j = 0$ for $i \neq j$, i, j > 1. Put
$e = e_2 + \cdots + e_n$; then e is again idempotent and $e_1e \equiv ee_1 \equiv 0 \pmod{J}$. By
Proposition 4.3.5 there exists an idempotent $e_1'$ such that $e_1' \equiv e_1 \pmod{J}$ and
$ee_1' = e_1'e = 0$. It follows that $e_1', e_2, \ldots, e_n$ are pairwise orthogonal idempotents.

Assume now that (4.3.3) holds and choose the $e_i'$ as in the first part. Then
$e = e_1' + \cdots + e_n'$ is an idempotent such that $e \equiv 1 \pmod{J}$, hence e = 1 by
Proposition 4.3.5(i). ■


The first part remains true (with the same proof) for a countable set of idem- 
potents, but it ceases to hold for uncountable sets (Zelinsky, 1954). 

Proposition 4.3.6 shows that every semiperfect ring R can be written as $R = \sum Re_i$,
where the $e_i$ are primitive and so the $Re_i$ are indecomposable. To establish the
uniqueness of such decompositions we shall want to apply the Krull—Schmidt 
theorem and we need to check that the endomorphism ring of an indecomposable 
left ideal Re is local. Here we shall need a couple of elementary lemmas: 


Lemma 4.3.7. A semiperfect ring in which 1 is a primitive idempotent is a local ring.


Proof. The semisimple ring R/J can be written as a direct sum of a finite number of 
simple left ideals, and this decomposition can be lifted to R. Since 1 is primitive, it is 
primitive (mod J), hence there is a single summand and so R/J is a skew field. This 
means that R is a local ring. ■


Lemma 4.3.8. Let R be any ring, e an idempotent in R and M a left R-module. Then
there is an isomorphism of left eRe-modules:

    $\mathrm{Hom}_R(Re, M) \cong eM.$    (4.3.4)

In particular, taking M = Re, we obtain a ring isomorphism

    $\mathrm{End}_R(Re) \cong eRe.$    (4.3.5)


Proof. Each homomorphism $\alpha: Re \to M$ is completely determined by its effect
on e. If $e\alpha = u$, then $x\alpha = (xe)\alpha = x(e\alpha) = xu$; in particular, $u = e\alpha = eu \in eM$.
Conversely, for any $u \in eM$ the mapping $\alpha_u: xe \mapsto xu$ is a homomorphism from
Re to M, and the correspondence $u \mapsto \alpha_u$ is additive, as is easily checked. It is
surjective, as we have seen, and if $\alpha_u = 0$, then u = eu = 0, so it is injective and
hence an isomorphism of abelian groups, or more generally, left eRe-modules.
This establishes (4.3.4). If M = Re, both sides acquire a multiplicative structure,
which is again compatible with the isomorphism, so we have a ring isomorphism
(4.3.5). ■

Over a semiperfect ring R, the top of any finitely generated R-module M can be
written

    $T(M) = \oplus S_i,$

where each $S_i$ is a simple quotient of M, by Lemma 4.2.1. We shall use this remark to
show that projective covers exist over a semiperfect ring.


Theorem 4.3.9. Any finitely generated module over a semiperfect ring R has a projective
cover. More precisely, the projective module P is a projective cover for M if and only if
$T(P) \cong T(M)$.


Proof. Let P be a projective cover for M, say $P/N \cong M$, where $N \subseteq JP$, by Lemma
4.2.1. Then $JM \cong JP/N$, hence $T(M) = M/JM \cong (P/N)/(JP/N) \cong P/JP = T(P)$, so
the condition is necessary. Now let M be any finitely generated left R-module and
put $\bar{R} = R/J$. As we have just seen, T(M) is a finite direct sum of simple R-modules,
which may also be regarded as simple $\bar{R}$-modules; further, any simple $\bar{R}$-module
has the form $\bar{R}\bar{e}$, where $\bar{e}$ is a primitive idempotent in $\bar{R}$. By definition of R, $\bar{e}$ lifts
to an idempotent e in R, which is again primitive, by Proposition 4.3.5. Write
$P_e = Re$ for the corresponding indecomposable projective and put $P = \oplus P_e$. Then
$T(P) = \oplus \bar{R}\bar{e} \cong T(M)$. More generally, given any projective module P and an
isomorphism $\theta: T(P) \to T(M)$, we have a diagram

         P -----> P/JP -----> 0
         |          |
   alpha |          | theta
         v          v
         M -----> M/JM -----> 0

Since P is projective, there exists $\alpha$ to make the diagram commutative. Given $x \in M$,
there exists $a \in P$ such that $a\alpha \equiv x \pmod{JM}$, hence $P\alpha + JM = M$, and so by
Nakayama's lemma, $P\alpha = M$. Since $\theta$ is injective, $\ker \alpha \subseteq JP$, therefore by
Lemma 4.2.2, P is a projective cover for M. ■


This result and its proof allow us to view finitely generated modules over a
semiperfect ring in a new light. With every such module M we associate on the one hand
its top T and on the other its projective cover P. We have essential mappings

    $P \to M, \quad P \to T,$


and P, T are the largest resp. smallest modules for which such essential mappings 
exist, for a given M. We conclude with an analogue of the Krull-Schmidt theorem 
for projective modules over semiperfect rings. 


Theorem 4.3.10. Let R be a semiperfect ring. Every finitely generated projective left 
R-module P can be written as a direct sum 


    $P = P_1 \oplus \cdots \oplus P_r,$    (4.3.6)

where each $P_i$ is isomorphic to an indecomposable left ideal which is a direct summand
of R, and the $P_i$ are unique up to isomorphism and order.


Proof. We have seen that R has a direct decomposition into indecomposable left
ideals, e.g. by lifting a complete direct decomposition of R/J. Hence for any $n \geq 1$,
$R^n$ can be written as a direct sum of indecomposable modules isomorphic to left
ideals; by Lemmas 4.3.8 and 4.3.7 each such left ideal has as endomorphism ring a
local ring, therefore Corollary 4.1.5 can be applied to establish the uniqueness of
this decomposition, up to isomorphism and order. Now if P is any finitely generated
projective left R-module, we have $P \oplus P' \cong R^n$ for some $n \geq 1$, and so we obtain a
decomposition $P \oplus P' \cong \oplus Q_i$, where each $Q_i$ is isomorphic to an indecomposable
left ideal. If we now take a direct decomposition of P with the maximal number
of terms and apply Proposition 4.1.4, we obtain a decomposition of the required
form. ■


It can be shown that semiperfect rings form the precise class of rings for which 
every finitely generated module has a projective cover. A ring over which every 
left module has a projective cover is said to be left perfect. Such a ring R is
characterized by the fact that R/J is semisimple and J is right vanishing (or also left
T-nilpotent), i.e. for any infinite sequence $\{a_n\}$ in J there exists n such that
$a_1 \cdots a_n = 0$ (see Bass [1960]).


Exercises 


1. Show that an idempotent e in a ring R is primitive iff eRe is non-trivial and has
no idempotents $\neq 0, 1$.

2. Let R be a ring and $\mathfrak{a}$ a minimal left ideal. Show that either $\mathfrak{a}^2 = 0$ and $\mathfrak{a}R$ is a
nilpotent two-sided ideal in R, or $\mathfrak{a} = Re$ for some idempotent e, and hence $\mathfrak{a}$ is
a direct summand in R.

3. Find conditions on idempotents e, f for Re = Rf to hold. 

4. Show that if Re = Rf for idempotents e, f where e is central, then f = ef = fe. If
f is also central, deduce that e = f.

5. Let R be a local ring and P a finitely generated projective left R-module. Show by 
lifting a basis of T(P) that P is free. 

6. Let R be a semiperfect ring such that R/J is simple. Show that R is a full matrix 
ring over a local ring. If further, R is an integral domain, deduce that it must be a 
local ring. 

7. Let e be a central idempotent in a ring R. Show that if $e_1$ is an idempotent such
that $e_1 \equiv e \pmod{J}$, then $e_1 = e$.

8. Show that if $1 = \sum e_i = \sum f_i$ are two decompositions of 1 into orthogonal
families of idempotents such that $e_i \equiv f_i \pmod{J}$, then $u = \sum e_if_i$ is a unit
and $f_i = u^{-1}e_iu$.

9. Show that if R is semiperfect, then so is $R_n$, for all $n \geq 1$.

10. Show that a commutative Artinian ring is a direct product of completely primary
rings (i.e. every non-unit is nilpotent). Give a counter-example in the
non-commutative case.

11. Show that a commutative ring is semiperfect iff it is a direct product of finitely
many local rings. Show that this ring is (left and right) perfect iff the maximal
ideal of each local factor is vanishing.


4.4 Equivalence of module categories


A natural question asks when two rings A, B have equivalent module categories. We
recall from BA, Section 4.4 that two rings A, B are called Morita equivalent, $A \sim B$, or
simply equivalent if there is a category equivalence $\mathrm{Mod}_A \cong \mathrm{Mod}_B$. There we saw too
that any ring A is equivalent to $A_n$, for all $n \geq 1$; however there are also other cases
and in this section and the next we shall find precise conditions for A and B to be
Morita equivalent. We begin by describing the notion of a generator which plays an
important role in what follows.

By a generator in an abelian category $\mathcal{A}$ one understands an object P in $\mathcal{A}$ such
that $h^P = \mathcal{A}(P, -)$ is faithful. An equivalent condition is that every $\mathcal{A}$-object is a
quotient of a copower of P (i.e. a direct sum of copies of P). For if P is a generator
and for any $\mathcal{A}$-object X we put

    $S = \coprod P_f,$

where $P_f = P$ and f runs over $\mathcal{A}(P, X)$, with natural injection $i_f: P_f \to S$; then
the family of maps $f: P_f \to X$ gives rise to a map $F: S \to X$ such that $f = i_fF$.
To establish that X is a quotient of S we show that F is epic. Let $g: X \to \mathrm{coker}\, F$
be the natural map. Then Fg = 0, hence fg = 0 for all $f \in \mathcal{A}(P, X)$ and since P is
a generator, it follows that g = 0. Thus coker F = 0 and F is epic, as claimed.

Conversely, assume that X is a quotient of $S = \coprod P_\lambda$, where $P_\lambda = P$, $F: S \to X$
and write $i_\lambda: P_\lambda \to S$ for the natural injection. Given $f: X \to Y$, $f \neq 0$, we have
$Ff \neq 0$ because F is epic, so $i_\lambda Ff \neq 0$ for some $\lambda$, but $i_\lambda F \in \mathcal{A}(P, X)$. This shows
P to be a generator.

It is clear that in the category $\mathrm{Mod}_R$ of all right R-modules, R is a generator. The
existence of a generator gives rise to a useful criterion for a natural isomorphism of
functors.


Theorem 4.4.1. Let $\mathcal{A}$, $\mathcal{B}$ be abelian categories with coproducts and let F, G be functors
from $\mathcal{A}$ to $\mathcal{B}$ which are right exact and preserve coproducts. If there is a natural
transformation

    $t: F \to G,$    (4.4.1)

such that for some generator P of $\mathcal{A}$, the map

    $t_P: P^F \to P^G$    (4.4.2)

is an isomorphism, then t is a natural isomorphism.

Proof. For any $X \in \mathrm{Ob}\,\mathcal{A}$ we have an exact sequence

    $S_1 \to S_2 \to X \to 0,$

where $S_1$, $S_2$ are copowers of P. Applying F and G we have the following commutative
diagram with exact rows (by the right exactness of F and G):

    S_1^F -----> S_2^F -----> X^F -----> 0
      |            |            |
      | t_1        | t_2        | t_X
      v            v            v
    S_1^G -----> S_2^G -----> X^G -----> 0

By hypothesis $S_1^F = (\coprod P_\lambda)^F = \coprod P_\lambda^F$, hence $t_1 = t_{S_1}$ is an isomorphism, and likewise
$t_2$. By the 5-lemma, $t_X$ is an isomorphism, as asserted. ■




Let us now consider right exact functors $\mathrm{Mod}_A \to \mathrm{Mod}_B$ which preserve direct
sums (coproducts). An example of such a functor is $- \otimes_A M$, where M is an (A, B)-
bimodule. The next result shows that this is essentially the only case:


Lemma 4.4.2 (Eilenberg-Watts). For any functor $S: \mathrm{Mod}_A \to \mathrm{Mod}_B$ between
module categories the following conditions are equivalent:


(a) S has a right adjoint $T: \mathrm{Mod}_B \to \mathrm{Mod}_A$;
(b) S is right exact and preserves direct sums;
(c) $S \cong - \otimes_A P$ for some (A, B)-bimodule P.


Moreover, when (a)-(c) hold, then the right adjoint T is given by

    $Y^T = \mathrm{Hom}_B(P, Y),$

where $P = A^S$, and T is unique up to natural isomorphism.


Proof. (a) ⇒ (b) follows by Theorem 2.2.7, since S is a left adjoint.

(b) ⇒ (c). Given a functor S, right exact and preserving direct sums, put $P = A^S$.
Each A-endomorphism of $A_A$ induces a B-endomorphism of P, but the
A-endomorphisms of $A_A$ are just the left multiplications by elements of A (BA,
Theorem 5.1.3 or also Lemma 4.3.8 above); thus P is an (A, B)-bimodule. Now consider
the functors S and $- \otimes_A P$: both are right exact and preserve direct sums, so to
show their isomorphism we need only, by Theorem 4.4.1, find a natural transformation
between them which, for the generator A of $\mathrm{Mod}_A$, is an isomorphism. Given
$X \in \mathrm{Mod}_A$, we have a map

    $X \to \mathrm{Hom}_A(A, X) \to \mathrm{Hom}_B(A^S, X^S) \cong \mathrm{Hom}_B(P, X^S).$    (4.4.3)

The result is a map $f_X: X \to \mathrm{Hom}_B(P, X^S)$ which is an A-module homomorphism.
By adjoint associativity we have

    $f_X \in \mathrm{Hom}_A(X, \mathrm{Hom}_B(P, X^S)) \cong \mathrm{Hom}_B(X \otimes P, X^S).$

Each step in (4.4.3) is natural in X, so $f_X$ is a natural transformation from $X \otimes P$ to
$X^S$. For X = A it clearly reduces to the identity, hence it is a natural isomorphism, as
claimed.

(c) ⇒ (a). Given (c), we define a functor $T: \mathrm{Mod}_B \to \mathrm{Mod}_A$ by

    $Y \mapsto \mathrm{Hom}_B(P, Y).$

Then $\mathrm{Hom}_B(X^S, Y) \cong \mathrm{Hom}_B(X \otimes P, Y) \cong \mathrm{Hom}_A(X, \mathrm{Hom}_B(P, Y)) = \mathrm{Hom}_A(X, Y^T)$,
where each step is natural in X and Y. Thus T is the required right adjoint of S.
When (a)-(c) hold, we have

    $Y^T \cong \mathrm{Hom}_A(A, Y^T) \cong \mathrm{Hom}_B(A^S, Y) = \mathrm{Hom}_B(P, Y),$

where $P = A^S$, and here P is clearly unique up to isomorphism. ■


We shall introduce a preordering of module categories on the basis of the next 
result: 




Proposition 4.4.3. For any rings A, B the following are equivalent:


(a) there exists a functor $S: \mathrm{Mod}_A \to \mathrm{Mod}_B$ with right adjoint $T: \mathrm{Mod}_B \to \mathrm{Mod}_A$
such that T has a right adjoint and $ST \cong 1$;

(b) there exist functors $S: \mathrm{Mod}_A \to \mathrm{Mod}_B$, $T: \mathrm{Mod}_B \to \mathrm{Mod}_A$, both right exact and
preserving direct sums, such that $ST \cong 1$;

(c) there exist modules ${}_AP_B$, ${}_BQ_A$ such that $P \otimes_B Q \cong A$, as bimodules.


If one (and hence all) of (a)-(c) holds, we shall call $\mathrm{Mod}_A$ a quotient category of
$\mathrm{Mod}_B$ and write $A \prec B$. The functor S is called the section functor and T the retraction
functor.


Proof. (a) ⇒ (b) follows by Lemma 4.4.2 and (b) ⇒ (a) will follow if we show T to
be right adjoint to S. This follows because we have the natural transformation

    $\mathrm{Hom}_B(X^S, Y) \to \mathrm{Hom}_A(X^{ST}, Y^T) \cong \mathrm{Hom}_A(X, Y^T),$

which is an isomorphism for X = A.

Moreover, if (b) holds, then by Lemma 4.4.2, $S = - \otimes_A P$, $T = - \otimes_B Q$, for some
${}_AP_B$, ${}_BQ_A$, and $ST \cong 1$ means that $P \otimes_B Q \cong A$, so (c) holds. Conversely, this
condition clearly implies (b). ■


We list some properties of quotient categories: 
Proposition 4.4.4. Let A, B be rings such that $A \prec B$, with modules ${}_AP_B$, ${}_BQ_A$ satisfying
$P \otimes Q \cong A$. Then


(i) $Q \cong \mathrm{Hom}_B(P, B)$, $P \cong \mathrm{Hom}_B(Q, B)$;

(ii) $A \cong \mathrm{End}_B(P) \cong \mathrm{End}_B(Q)$;

(iii) $P_B$ and ${}_BQ$ are projective;

(iv) ${}_AP$ and $Q_A$ are generators;

(v) we have the following lattice homomorphisms with right inverses ('retractions'):

$\mathrm{Lat}(A_A) \to \mathrm{Lat}(P_B)$ with 2-sided ideals of A corresponding to (A, B)-submodules
of P,

$\mathrm{Lat}({}_AA) \to \mathrm{Lat}({}_BQ)$ with 2-sided ideals of A corresponding to (B, A)-submodules
of Q.


Proof. In each case it is enough to prove the first part; the second then follows by
symmetry.


(i) Write $S = - \otimes P$, $T = - \otimes Q$; we have

    $\mathrm{Hom}_B(P, B) = \mathrm{Hom}_B(A^S, B) \cong \mathrm{Hom}_A(A, B^T) \cong Q.$

(ii) We have the bimodule homomorphisms

    $\mathrm{End}_B(P) = \mathrm{Hom}_B(A^S, P) \cong \mathrm{Hom}_A(A, P^T) \cong \mathrm{Hom}_A(A, A) \cong A.$

Each term has a natural multiplication; all these correspond and give a ring
isomorphism.

(iii) We have

    $\mathrm{Hom}_B(P, Y) \cong \mathrm{Hom}_A(A, Y^T) \cong Y^T,$

and by hypothesis the functor $T: Y \mapsto Y^T$ is right exact. Hence $\mathrm{Hom}_B(P, -)$ is
right exact and so $P_B$ is projective.

(iv) We have $P \oplus P' \cong {}^IB$, where ${}^IB$ stands for a direct sum of |I| copies of B. Hence

    ${}^IQ \cong {}^IB \otimes Q \cong (P \otimes Q) \oplus (P' \otimes Q) \cong A \oplus (P' \otimes Q).$

This shows that Q is a generator, because A is one.

(v) It is clear that the functor S induces a map from $\mathrm{Lat}(A_A)$ to $\mathrm{Lat}(P_B)$ which is
order-preserving and has a right inverse, induced by T; further, A-bimodules
(= ideals) of A correspond to (A, B)-bimodules in P. ■


Later, in Theorem 4.5.4, we shall find that the modules P and Q are actually 
finitely generated. For the moment we note that Proposition 4.4.3 leads to a criterion 
for Morita equivalence: 


Theorem 4.4.5. For any rings A, B the following conditions are equivalent:


(a) $\mathrm{Mod}_A \cong \mathrm{Mod}_B$,
(a°) ${}_A\mathrm{Mod} \cong {}_B\mathrm{Mod}$,
(b) there exist bimodules ${}_AP_B$, ${}_BQ_A$ with bimodule isomorphisms

    $P \otimes_B Q \cong A, \quad Q \otimes_A P \cong B.$

Moreover, when (b) holds and $f: P \otimes Q \to A$, $g: Q \otimes P \to B$ are bimodule
isomorphisms, these may be chosen so as to make the following diagrams commutative:

    P⊗Q⊗P --f⊗1--> A⊗P          Q⊗P⊗Q --g⊗1--> B⊗Q
      |              |             |              |
      | 1⊗g          |             | 1⊗f          |
      v              v             v              v
     P⊗B ---------> P            Q⊗A ---------> Q


Proof. The equivalence of (a), (b) is clear by the proof of Proposition 4.4.3, and now
the equivalence of (a°), (b) follows by the evident symmetry of (b). Let us pick
isomorphisms $f: P \otimes Q \to A$, $g: Q \otimes P \to B$; then all the arrows in the diagrams are
isomorphisms. Consider the first diagram. If we take $p \in P$ and move it (anticlockwise)
round the square we obtain $p\theta \in P$, where $\theta$ is an (A, B)-automorphism of P.
Now $\mathrm{End}_B(P) \cong \mathrm{End}_A(A) \cong A$, so $\theta$ is left multiplication by a unit u in A; since $\theta$
is also an A-automorphism, u lies in the centre of A. If we replace f by uf, then
the first square becomes commutative. We complete the proof by showing that
with this choice of f, g the second square also commutes. For brevity write
$f(p \otimes q) = (p, q)$, $g(q \otimes p) = [q, p]$; we have adjusted f so that

    $(p, q)p' = p[q, p'],$    (4.4.4)

and we must show that

    $[q, p]q' = q(p, q').$    (4.4.5)

Given $p, p' \in P$, $q, q' \in Q$, we have by the left and right B-linearity of g,

    $[[q, p]q', p'] = [q, p][q', p'] = [q, p[q', p']].$

By (4.4.4) and the fact that g is A-balanced, this is

    $[q, p[q', p']] = [q, (p, q')p'] = [q(p, q'), p'].$

Thus if $c = [q, p]q' - q(p, q')$, then $c \in Q$ and $[c, p'] = 0$ for all $p' \in P$. Let us define
$h: A \to Q$ by h(a) = ca; then $(h \otimes 1)g: A \otimes P \to Q \otimes P \to B$ and this map is zero
because $[c, p'] = 0$. But g is an isomorphism, so $h \otimes 1 = h^T = 0$ and T is an
equivalence, hence h = 0. Thus c = h(1) = 0, as we had to show. ■


To illustrate the result, we take $B = A_n$ for some $n \geq 1$. Then we may choose
$P = A^n$, $Q = {}^nA$ and it is clear that $P \otimes Q \cong A$, $Q \otimes P \cong A_n$.
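For this illustration the two pairings can be written down concretely: (p, q) is the product of a row by a column, and [q, p] is the product of a column by a row. The following minimal sketch checks the identities (4.4.4) and (4.4.5) on sample vectors, assuming A is the ring of integers and n = 3 (our choices, not from the text); all function names are hypothetical.

```python
def row_times_col(p, q):
    """(p, q): row vector times column vector, a scalar in A."""
    return sum(pi * qi for pi, qi in zip(p, q))


def col_times_row(q, p):
    """[q, p]: column vector times row vector, an n x n matrix over A."""
    return [[qi * pj for pj in p] for qi in q]


def row_times_matrix(p, x):
    """Right action of the matrix ring on row vectors."""
    return [sum(p[i] * x[i][j] for i in range(len(p))) for j in range(len(x[0]))]


def matrix_times_col(x, q):
    """Left action of the matrix ring on column vectors."""
    return [sum(x[i][j] * q[j] for j in range(len(q))) for i in range(len(x))]


p, p2 = [1, 2, 3], [4, 0, -1]   # rows, elements of P
q, q2 = [2, -1, 5], [0, 3, 1]   # columns, elements of Q

# (4.4.4): (p, q) p' = p [q, p']
lhs_444 = [row_times_col(p, q) * c for c in p2]
rhs_444 = row_times_matrix(p, col_times_row(q, p2))
assert lhs_444 == rhs_444

# (4.4.5): [q, p] q' = q (p, q')
lhs_445 = matrix_times_col(col_times_row(q, p), q2)
rhs_445 = [c * row_times_col(p, q2) for c in q]
assert lhs_445 == rhs_445
```

Both identities are instances of the associativity of matrix multiplication; since the integers are commutative, the sketch does not distinguish left from right scalar multiplication.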

As a first consequence we see how to sharpen Proposition 4.4.4 for Morita equiva- 
lent rings. 


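Corollary 4.4.6. If A, B are Morita equivalent rings, with bimodules P, Q satisfying
$P \otimes Q \cong A$, $Q \otimes P \cong B$, then $\mathrm{Lat}(A_A) \cong \mathrm{Lat}(P_B)$, $\mathrm{Lat}({}_AA) \cong \mathrm{Lat}({}_BQ)$,
$\mathrm{Lat}({}_BB) \cong \mathrm{Lat}({}_AP)$, $\mathrm{Lat}(B_B) \cong \mathrm{Lat}(Q_A)$. Moreover,

    $\mathrm{Lat}({}_AA_A) \cong \mathrm{Lat}({}_AP_B) \cong \mathrm{Lat}({}_BB_B);$

in other words, A and B have isomorphic ideal lattices. ■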


By a Morita invariant we understand a property of rings which is preserved by 
Morita equivalence. For example, being simple is a Morita invariant, by Corollary
4.4.6.

As another example of a Morita invariant we have the centre. It is convenient to
define this notion in the wider context of abelian categories. In any abelian category
$\mathcal{A}$ let Nat(I) be the set of all natural transformations from the identity functor to
itself. Since functors and natural transformations themselves form a category $\mathcal{A}^{\mathcal{A}}$,
it follows that the set Nat(I), as the set of all 'endomorphisms' of I, is a ring; it is
called the centre of the category $\mathcal{A}$. Explicitly, $\alpha \in \mathrm{Nat}(I)$ means that for each
$X \in \mathrm{Ob}\,\mathcal{A}$ there is a map $\alpha_X: X \to X$ such that any map $f: X \to Y$ in $\mathcal{A}$ gives
rise to a commutative square

              f
         X -------> Y
         |          |
 alpha_X |          | alpha_Y
         v          v
         X -------> Y
              f

In the particular case where $\mathcal{A} = \mathrm{Mod}_A$, Nat(I) consists of all A-endomorphisms
which commute with all A-homomorphisms. Writing C for the centre of the
ring A, we have for each $c \in C$ an element $\mu_c$ of Nat(I), defined as

    $x\mu_c = xc$ for $x \in X$, $X \in \mathrm{Mod}_A$.

It is clear that the map $\mu: c \mapsto \mu_c$ defines a ring homomorphism

    $C \to \mathrm{Nat}(I).$

We assert that this is an isomorphism: if $\mu_c = 0$, take X = A; then $0 = 1\mu_c = c$,
so c = 0 and $\mu$ is injective. To prove surjectivity, let $f \in \mathrm{Nat}(I)$, say $1f_A = c$. By
naturality, ac = ca, hence $c \in C$ and so $g = f - \mu_c \in \mathrm{Nat}(I)$. Now on A, g = 0 by
definition, and given $x \in X$, where $X \in \mathrm{Mod}_A$, we define $\varphi: A \to X$ by $1 \mapsto x$.
Then commutativity shows that $g(x) = g(\varphi(1)) = \varphi(g(1)) = 0$, so g = 0 and $f = \mu_c$.
The result may be stated as follows:


Theorem 4.4.7. The centre of a ring A is isomorphic to the centre of the category $\mathrm{Mod}_A$.
Hence Morita equivalent rings have isomorphic centres. ■


This result shows for example that two commutative rings are equivalent iff they 
are isomorphic; more generally, a ring R is equivalent to a commutative ring iff it is 
equivalent to its centre. 

The property of being finitely generated can be expressed categorically: M is 
finitely generated iff M cannot be expressed as the union of a chain of proper sub- 
modules. This means that any module corresponding to a finitely generated module 
under a category equivalence is again finitely generated. However, the cardinal of a 
minimal generating set may well be different for the two modules, and this fact can 
be utilized to turn any problem on finitely generated modules into a problem on 
cyclic modules. In order to show this clearly we examine the equivalence $A \sim A_n$
in greater detail.

Fix $n \geq 1$ and write $P = A^n$, $Q = {}^nA$. We have the functors

    $M \mapsto M^n = M \otimes_A P \quad (M \in \mathrm{Mod}_A),$    (4.4.6)

and

    $N \mapsto N^{-} = N \otimes Q \quad (N \in \mathrm{Mod}_{A_n}).$    (4.4.7)

Here $N^{-}$ may also be defined as $\mathrm{Hom}_{A_n}(P, N)$. It is easily checked that

    $(M^n)^{-} \cong M, \quad (N^{-})^n \cong N,$

and this provides an explicit form for the equivalence $A \sim A_n$. Given any finitely
generated right A-module M, with generating set $u_1, \ldots, u_n$, say, we can apply
(4.4.6) and pass to the right $A_n$-module $M^n$, which is generated by the single element
$(u_1, \ldots, u_n)$. We state the result as


Theorem 4.4.8. For any ring A, any finitely generated A-module M corresponds to a
cyclic $A_n$-module under the category-equivalence (4.4.6), for suitable n. In fact it is
enough to take n equal to the cardinal of a generating set of M. ■
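A small computation makes the passage to a cyclic module concrete. The sketch below assumes A is the ring of integers and M = Z/6 with the two generators 2 and 3 (our choice of example, not from the text), and checks that the single row (2, 3) generates the module of all rows over M under the right action of 2 x 2 integer matrices; the function names are hypothetical.

```python
def act(row, x, modulus):
    """Right action of a 2x2 integer matrix x on a row (m1, m2) of elements of Z/modulus."""
    m1, m2 = row
    return ((m1 * x[0][0] + m2 * x[1][0]) % modulus,
            (m1 * x[0][1] + m2 * x[1][1]) % modulus)


modulus = 6
u = (2, 3)  # 2 and 3 together generate Z/6, since 3 - 2 = 1


def matrix_reaching(target):
    """Return a matrix x with u.x = target, using m = 2*(-m) + 3*m in Z/6."""
    t1, t2 = target
    return ((-t1, -t2), (t1, t2))


# Every row (t1, t2) over Z/6 is the image of u under some matrix.
reached = {act(u, matrix_reaching((t1, t2)), modulus)
           for t1 in range(modulus) for t2 in range(modulus)}
assert len(reached) == modulus * modulus
```

Neither 2 nor 3 generates Z/6 alone, so two generators are used here, while the corresponding module over the 2 x 2 matrix ring needs only the single generator (2, 3).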


For example, if A is a principal ideal ring, any submodule of an n-generator
A-module can be generated by n elements (as follows from Proposition 8.2.3 below).
Applying Theorem 4.4.8, we see that any submodule of a cyclic $A_n$-module is
cyclic, so $A_n$ is again a principal ideal ring. In the opposite direction, if $A_n$ is a
principal ideal ring, then any submodule of a cyclic module is cyclic. It follows that any
submodule of an n-generator A-module can be generated by n elements. This can happen
for some n > 1 for certain rings which are not principal (see Webber [1970]).


Exercises 

1. Show that a skew field K is Morita equivalent only to $K_n$, n = 1, 2, ....

2. Verify directly that the centre of A is $\mathrm{Hom}_{A-A}(A, A)$.

3. Verify that for a ring to be Noetherian or Artinian is a Morita invariant.

4. Show that $A \prec B \Leftrightarrow A^{\circ} \prec B^{\circ}$.

5. Show that any non-trivial ring without IBN has a simple homomorphic image
without IBN. Verify that a simple ring without IBN is Morita equivalent only
to a finite number of rings (up to isomorphism).


4.5 The Morita context 


We have seen in Theorem 4.4.5 that a Morita equivalence between two rings A and B 
is determined by two bimodules P, Q, but in practice one wants to know for which 
pairs of rings A, B it is true that A ~ B. The first step is to find conditions on P and Q 
for Theorem 4.4.5 to apply. Here we need a general property of modules. 

For any right A-module M we define its dual as $M^* = \mathrm{Hom}_A(M, A)$; this is a left
A-module in a natural way. Let us write the image of $x \in M$ under $f \in M^*$ as (f, x)
and put

    $\tau(M) = \tau_A(M) = \{\sum (f_i, x_i) \mid f_i \in M^*, x_i \in M\}.$

This is a two-sided ideal in A, called the trace ideal of M. For example, if F is a
non-zero free A-module, then $\tau(F) = A$. The modules for which $\tau = A$ are of particular
interest; as we see from the next result, they are just the generators of the category
$\mathrm{Mod}_A$.
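For a finite commutative ring the trace ideal can be found by enumeration. The following sketch assumes A = Z/n and M = Z/d with d dividing n (our choice of example, not from the text) and uses the fact that a homomorphism from Z/d to Z/n is determined by the image c of 1, subject to dc = 0 in Z/n; the function name `trace_ideal` is hypothetical.

```python
def trace_ideal(n, d):
    """Trace ideal of M = Z/d viewed as a module over A = Z/n, where d divides n.

    A homomorphism M -> A is determined by the image c of 1, subject to d*c = 0 (mod n).
    """
    if n % d != 0:
        raise ValueError("d must divide n")
    homs = [c for c in range(n) if (d * c) % n == 0]
    values = {(c * x) % n for c in homs for x in range(d)}
    # Close under addition to obtain the ideal generated by all values (f, x).
    ideal = {0}
    frontier = {0}
    while frontier:
        new = {(s + v) % n for s in frontier for v in values} - ideal
        ideal |= new
        frontier = new
    return sorted(ideal)


# Over Z/12 the module Z/4 has trace ideal 3Z/12, a proper ideal,
# while the free module Z/12 has trace ideal the whole ring.
assert trace_ideal(12, 4) == [0, 3, 6, 9]
assert trace_ideal(12, 12) == list(range(12))
```

Thus the trace ideal of Z/4 over Z/12 is proper, whereas that of the free module of rank one is the whole ring, in agreement with the remark on free modules above.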


Lemma 4.5.1. Let A be any ring. For any right A-module M the following are 
equivalent: 


(a) M is a generator,
(b) $\tau_A(M) = A$,
(c) $M^n \cong A \oplus N$ for some integer n and some $N_A$.


Proof. (a) = (b). Assume that t(M) = a #4 A; then the natural homomorphism 
mz: A-—» A/a is non-zero, hence by (a) the induced map 


M* = Hom(M.A) > Hom(M, A/a) (4.5.1) 


is non-zero. But every f € M* maps M into a by assumption, and this means that 
(4.5.1) is zero, a contradiction. Hence t(M) =A and (b) holds. 


156 Algebras 


(b) ⇒ (c). By hypothesis τ(M) = A, hence there exist f_1, ..., f_n ∈ M*,
u_1, ..., u_n ∈ M such that Σ (f_i, u_i) = 1. We define a homomorphism μ : M^n → A by the
rule (x_1, ..., x_n) ↦ Σ (f_i, x_i). Its image in A is a right ideal containing
1 = Σ (f_i, u_i), hence μ is surjective and if ker μ = N, we have the exact sequence

\[ 0 \to N \to M^n \to A \to 0. \]

Since A is projective, this sequence splits and (c) follows.

(c) ⇒ (a). Given a map f : X → Y of A-modules, if the induced map
Hom(M, X) → Hom(M, Y) is zero, then this also holds for the map
Hom(M^n, X) → Hom(M^n, Y), i.e. Hom(A ⊕ N, X) → Hom(A ⊕ N, Y). But the
restriction to the first summand is just the original map f : X → Y (because
Hom(A, X) ≅ X), so f = 0. This shows h^M to be faithful, so (a) holds. ∎
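The criterion (b) of Lemma 4.5.1 can be tested by hand in small cases. The following Python sketch is an illustration only and not part of the original text: it assumes A = Z/n and M = Z/m with m dividing n, and the helper names trace_ideal and is_generator are hypothetical. It lists all homomorphisms M → A and collects the ideal generated by their values.

```python
def trace_ideal(n, m):
    """Trace ideal of M = Z/m as a module over A = Z/n (assumes m divides n).

    A homomorphism f: Z/m -> Z/n is determined by c = f(1), subject to
    m*c = 0 in Z/n; the trace ideal consists of all sums of values f(x).
    """
    assert n % m == 0
    # admissible images of the generator 1 of Z/m
    gens = [c for c in range(n) if (m * c) % n == 0]
    # all values f(x) = c*x taken by these homomorphisms
    values = {(c * x) % n for c in gens for x in range(m)}
    # close under addition to obtain the ideal they generate
    ideal = {0}
    changed = True
    while changed:
        new = ideal | {(a + b) % n for a in ideal for b in values}
        changed = new != ideal
        ideal = new
    return sorted(ideal)


def is_generator(n, m):
    """Criterion (b) of Lemma 4.5.1: the trace ideal is the whole ring."""
    return len(trace_ideal(n, m)) == n
```

For example, over A = Z/6 the module Z/2 has trace ideal {0, 3}, so by the lemma it is not a generator, whereas A itself trivially is.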


We now consider the following situation. Given two rings A, B, an (A, B)-bimodule P
and a (B, A)-bimodule Q, assume that we have two bimodule homomorphisms:

\[ \tau : P \otimes Q \to A, \quad f \otimes x \mapsto (f, x), \tag{4.5.2} \]
\[ \mu : Q \otimes P \to B, \quad x \otimes f \mapsto [x, f], \tag{4.5.3} \]

such that, for all f, g ∈ P and x, y ∈ Q,

\[ [x, f]y = x(f, y), \tag{4.5.4} \]
\[ g[x, f] = (g, x)f. \tag{4.5.5} \]

These rules may be expressed symbolically by saying that

\[ \begin{pmatrix} A & P \\ Q & B \end{pmatrix} \]

is a ring under the usual matrix multiplication. This sums up the module laws and
(4.5.2), (4.5.3), while (4.5.4), (4.5.5) are instances of the associative law. The 6-tuple
(A, B, P, Q, τ, μ) is called a Morita context. We remark that im τ is an ideal in A
and im μ is an ideal in B.

Starting from any module E_A, we obtain a Morita context as follows. We put

\[ E^* = \operatorname{Hom}_A(E, A), \qquad B = \operatorname{End}_A(E), \]

and regard E as a (B, A)-bimodule and E* as an (A, B)-bimodule in the natural way.
Further, we have a natural map τ : E* ⊗ E → A given by evaluation, as in (4.5.2).
To find μ, we use (4.5.4): for any x ∈ E, f ∈ E* we define [x, f] by its effect on E:

\[ [x, f]y = x(f, y) \qquad (y \in E). \]

This is an A-balanced biadditive map of E × E* into B and so defines a homomorphism
μ : E ⊗ E* → B which is easily verified to be a B-bimodule homomorphism.
Further, (4.5.4) holds by the definition of μ, and the definition of E* as B-module
shows that for all y ∈ E, (g[x, f], y) = (g, [x, f]y) = (g, x(f, y)) = (g, x)(f, y) =
((g, x)f, y); therefore g[x, f] = (g, x)f and so (4.5.5) is proved.




We thus have a Morita context (A, B, E*, E, τ, μ) starting from E_A; this is called
the Morita context derived from E_A. To give an example, if A is a simple Artinian
ring, say A ≅ K_n, where K is a skew field (by Wedderburn's theorem, BA, Theorem
5.2.2) and E = K^n is a simple right A-module, then E* = ^nK and the derived Morita
context has the form (K_n, K, ^nK, K^n, τ, μ).

For general Morita contexts we shall be interested in the case where τ, μ are
isomorphisms. In that case we have a Morita equivalence between A and B, by Theorem
4.4.5, with functors

\[ S : \mathrm{Mod}_A \to \mathrm{Mod}_B, \quad M \mapsto M \otimes_A P, \]

\[ T : \mathrm{Mod}_B \to \mathrm{Mod}_A, \quad N \mapsto N \otimes_B Q, \]

which are mutually inverse, because P ⊗_B Q ≅ A, Q ⊗_A P ≅ B. Under S, A corresponds
to P and right ideals of A correspond to B-submodules of P, with two-sided ideals
corresponding to (A, B)-submodules. Similarly, Q corresponds to B and under
T, B corresponds to Q and P to A (see Proposition 4.4.4). Moreover, we have the
isomorphisms
isomorphisms 


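\[ Q \cong \operatorname{Hom}_A(P, A) \cong \operatorname{Hom}_B(P, B), \qquad P \cong \operatorname{Hom}_A(Q, A) \cong \operatorname{Hom}_B(Q, B), \]

\[ A \cong \operatorname{End}_B(Q) \cong \operatorname{End}_B(P)^{\circ}, \qquad B \cong \operatorname{End}_A(P) \cong \operatorname{End}_A(Q)^{\circ}. \]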


The next lemma shows that to verify that τ or μ is an isomorphism it is enough to
check surjectivity:

Lemma 4.5.2. In any Morita context (A, B, P, Q, τ, μ), if μ is surjective, it is an
isomorphism; similarly for τ.

Proof. If μ is surjective, we have

\[ \sum_i [x_i, f_i] = 1 \quad \text{for some } x_i \in Q,\ f_i \in P. \]

Now assume that Σ_j [y_j, g_j] = 0; then

\[ \sum_j y_j \otimes g_j = \sum_{i,j} y_j \otimes g_j[x_i, f_i] = \sum_{i,j} y_j \otimes (g_j, x_i)f_i = \sum_{i,j} y_j(g_j, x_i) \otimes f_i = \sum_{i,j} [y_j, g_j]x_i \otimes f_i = 0. \]

This shows μ to be injective, and hence an isomorphism. The same argument applies
to τ. ∎


In the special case of a derived Morita context we can give an explicit criterion for
μ to be surjective. We recall the dual basis lemma from BA, Lemma 4.7.5. In the
finitely generated case (BA, Corollary 4.7.6) this states that _AP is a direct summand
of A^n iff there exist u_1, ..., u_n ∈ P, f_1, ..., f_n ∈ P* (the 'projective coordinate
system') such that

\[ x = \sum_i (f_i, x)u_i \quad \text{for all } x \in P. \tag{4.5.6} \]

Similarly for a right module this equation takes the form x = Σ u_i(f_i, x).


Lemma 4.5.3. Given any module Q_A, let (A, B, P, Q, τ, μ) be the derived Morita
context. Then μ : Q ⊗ P → B is an isomorphism if and only if Q_A is finitely generated
projective.

Proof. By the dual basis lemma just quoted, Q_A is finitely generated projective iff
there is a finite projective coordinate system

\[ x = \sum_i u_i(f_i, x) \quad \text{for all } x \in Q. \]

Bearing in mind that Q* = P, we can by (4.5.4) write this as x = Σ [u_i, f_i]x for all
x ∈ Q, i.e. Σ [u_i, f_i] = 1. But this is just the condition for μ to be surjective, and by
Lemma 4.5.2, for μ to be an isomorphism. ∎


We can now state a condition on any module Q_A for its derived Morita context to
define an equivalence.

Theorem 4.5.4. Let A be a ring, Q a right A-module and (A, B, P, Q, τ, μ) the Morita
context derived from Q. Then this context defines a Morita equivalence between A and B
if and only if Q is a finitely generated projective generator.

A finitely generated projective generator is also called a progenerator.

Proof. If we have a Morita equivalence, then Q_A corresponds to B_B in the
equivalence, so Q is a progenerator because B is. Conversely, assume this condition
for Q. By Lemma 4.5.3 the map μ : Q ⊗ P → B is an isomorphism. Since Q is a
generator, the trace ideal im τ is A (Lemma 4.5.1), so τ is surjective and hence an
isomorphism by Lemma 4.5.2. Thus P ⊗ Q ≅ A, and now Theorem 4.4.5 shows
that we have indeed a Morita equivalence. ∎


This result shows that every Morita equivalence can be obtained from a particular
Morita context. Given A ~ B, to find Q we need a finitely generated projective
A-module; this is a direct summand of A^n for some n ≥ 1, and it may be specified
by an idempotent e in A_n ≅ End_A(A^n). Suppose first that n = 1; this means that
Q_A = eA, where e is an idempotent in A. By Lemma 4.3.8 (or rather, its left-right
dual) we have P = Hom_A(eA, A) ≅ Ae, B = End_A(eA) ≅ eAe. Now the condition
for Q to be a generator reads: the natural map P ⊗ Q → A is surjective, i.e.
AeA = A. The translation to A_n is now clear. We choose n ≥ 1 and take an
idempotent e in A_n such that A_n e A_n = A_n. Then we have B = eA_ne, and all rings
Morita equivalent to A are obtained in this way, with the appropriate Morita context
(A, eA_ne, A^ne, e(^nA), τ, μ).
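The first case (n = 1) of this recipe is easy to check for a full matrix ring. The sketch below is an illustration under stated assumptions, not the author's construction: it takes A to be the ring of 3 × 3 matrices (with integer entries standing in for a field K), e = E_11, and hypothetical helpers matmul, matadd, unit and corner. It verifies that 1 = Σ E_i1 e E_1i lies in AeA, so that AeA = A, and that eAe consists of the matrices concentrated in the (1,1)-entry, a copy of K.

```python
def matmul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matadd(x, y):
    """Sum of two square matrices."""
    n = len(x)
    return [[x[i][j] + y[i][j] for j in range(n)] for i in range(n)]


def unit(n, i, j):
    """Matrix unit with 1 in row i, column j (0-based) and 0 elsewhere."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]


n = 3
e = unit(n, 0, 0)  # the idempotent E_11
identity = [[1 if r == c else 0 for c in range(n)] for r in range(n)]

# 1 = sum_i E_{i1} e E_{1i}, hence 1 lies in the ideal AeA and AeA = A
total = [[0] * n for _ in range(n)]
for i in range(n):
    total = matadd(total, matmul(matmul(unit(n, i, 0), e), unit(n, 0, i)))


def corner(a):
    """The element eae of the corner ring eAe."""
    return matmul(matmul(e, a), e)
```

Here total equals the identity matrix, and corner(a) retains only the (1,1)-entry of a, in line with the Morita equivalence between K_n and K.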

An important particular case is obtained by starting from a commutative ring K,
say. If Q is any finitely generated projective K-module, then A = End_K(Q) is a
K-algebra and for any K-algebra R we have

\[ R \sim \operatorname{End}_R(R \otimes Q) \cong R \otimes A, \]

as is easily checked. Thus A ~ K and tensoring with A (over K) converts any
K-algebra into one Morita equivalent to it. Such algebras are called Brauer equivalent;
in the special case when K is a field, we have A ≅ K_n and R ⊗ K_n ≅ R_n, while the
general case leads to the study of Azumaya algebras (Azumaya [1950], Auslander
and Goldman [1960]; see also Section 8.6 below).

We conclude this section by describing another important Morita invariant, the 
trace group. Let K be any commutative ring and consider a K-algebra A. With A 
we associate a K-module, its trace group, defined as 


\[ T(A) = A/C, \quad \text{where } C = \Bigl\{ \sum_i (x_iy_i - y_ix_i) \Bigm| x_i, y_i \in A \Bigr\}. \tag{4.5.7} \]

The natural map A → T(A) is called the trace function and is written tr(x). Clearly it
has the following properties:

T.1 tr : A → T(A) is K-linear,
T.2 tr(xy) = tr(yx) for all x, y ∈ A.

Moreover, any linear map α : A → M into a K-module such that α(xy) = α(yx) can
be written as α(x) = α′(tr(x)) for a unique α′ : T(A) → M. Thus tr is the universal
mapping satisfying T.1-T.2. Let us show that T is a Morita invariant:


Proposition 4.5.5. For any rings A, B, if A ~ B, then T(A) ≅ T(B).

Proof. Let (A, B, P, Q, ( , ), [ , ]) be the Morita context of the equivalence and
consider the map A → T(B) given by

\[ \alpha : \sum_i (f_i, x_i) \mapsto \operatorname{tr}\Bigl( \sum_i [x_i, f_i] \Bigr), \quad \text{where } f_i \in P,\ x_i \in Q. \]

To show that this is well-defined we must check that the map (f, x) ↦ tr([x, f]) is
bilinear and B-balanced. The bilinearity is clear, and we have for any b ∈ B,

\[ \operatorname{tr}([x, fb]) = \operatorname{tr}([x, f]b) = \operatorname{tr}(b[x, f]) = \operatorname{tr}([bx, f]), \]

by T.2, hence the result. Moreover, α(aa′) = α(a′a), because α(Σ (f, x)a) =
tr(Σ [xa, f]) = tr(Σ [x, af]) = α(Σ a(f, x)); hence α induces a map
α′ : T(A) → T(B). By symmetry there is a map β′ : T(B) → T(A) and these two
maps are easily seen to be mutually inverse. ∎
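For the equivalence K ~ K_2 the proposition predicts T(K_2) ≅ T(K) = K, and this can be seen directly: every commutator has ordinary matrix trace zero, and conversely the trace-zero matrices E_12, E_21, E_11 − E_22 are themselves commutators, so C in (4.5.7) is exactly the kernel of the matrix trace. The following Python sketch checks these identities; it is an illustration only, with integer entries standing in for K and with hypothetical helper names.

```python
N = 2  # size of the matrices


def matmul(x, y):
    """Product of two N x N matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def commutator(x, y):
    """The additive commutator xy - yx."""
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(N)] for i in range(N)]


def trace(x):
    """Ordinary matrix trace."""
    return sum(x[i][i] for i in range(N))


def unit(i, j):
    """Matrix unit with 1 in row i, column j (0-based)."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(N)] for r in range(N)]


# trace-zero matrices that are visibly commutators
e12 = commutator(unit(0, 0), unit(0, 1))   # E_11 E_12 - E_12 E_11 = E_12
e21 = commutator(unit(1, 1), unit(1, 0))   # E_22 E_21 - E_21 E_22 = E_21
diag = commutator(unit(0, 1), unit(1, 0))  # E_12 E_21 - E_21 E_12 = E_11 - E_22
```

Since these three matrices span the trace-zero matrices, the ordinary trace induces an isomorphism T(K_2) ≅ K in this case.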


As an illustration, if A is an algebra with centre K, then of the two Morita invar- 
iants C(A), T(A) the centre is just K, whereas T(A) is a K-module which in general is 
larger than K and so will tell us more about A. We remark that in (4.5.7) C is merely 
a K-module and not an ideal; this means that T(A) will often be non-zero even if A is 
simple. However, for some rings T(A) = 0 (see Section 9.3, Exercise 10). The trace 
function introduced here was first defined by Akira Hattori and independently by 
John Stallings in 1965.

The notion of Morita equivalence was introduced by Kiiti Morita in 1958. The 
above account follows the author’s notes (Cohn (1966)), which in turn were based 
on the notes of Hyman Bass (1962). 




Exercises 


1. Given a Morita context (A, B, P, Q, τ, μ), if P ⊗ Q ≅ A and A ≅ End(Q), show
that Q is finitely generated projective.

2. Let K be a commutative ring and P a finitely generated faithful projective module
(P is faithful if Pa = 0 ⇒ a = 0). Show by using the dual basis lemma that
τ(P) = K and so P is a generator.

3. If (A, B, P, Q, τ, μ) is a Morita context defining a Morita equivalence and

\[ R = \begin{pmatrix} A & P \\ Q & B \end{pmatrix}, \]

show that A ~ R, B ~ R.

4. Let K be any ring, E, F any K-modules and define P = Hom(E, F),
Q = Hom(F, E), A = End(E), B = End(F). Show that together with the natural
maps P ⊗ Q → A, Q ⊗ P → B this defines a Morita context.

5. (Kazuhiko Hirata) Show that the Morita context of Exercise 4 defines a Morita
equivalence between A and B iff there exist integers m, n ≥ 1 and K-modules
E′, F′ such that E ⊕ E′ ≅ F^n, F ⊕ F′ ≅ E^m.


4.6 Projective, injective and flat modules 


We now take a closer look at flat modules and their relation to projective and injec- 
tive modules. 
Let R be any ring. Any left R-module M has a presentation

\[ G \xrightarrow{\ \alpha\ } F \xrightarrow{\ \beta\ } M \to 0, \tag{4.6.1} \]

where F, G are free R-modules. Explicitly, let F have the basis (f_j) and G the basis
(g_i); then α is described by the matrix A = (a_ij), which is said to present M and is
called the presentation matrix of M, where

\[ g_i\alpha = \sum_j a_{ij}f_j. \tag{4.6.2} \]

In general F, G will not be finitely generated and A may have infinitely many rows
and columns, but each row has only finitely many non-zero entries; we say that A
is row-finite. We note that M is determined up to isomorphism by A as the cokernel
of the corresponding map α, by (4.6.1). Moreover, every row-finite matrix A defines
a module in this way. If F, G have finite ranks m, n respectively, then the presentation
matrix is n × m.

It is clear that F can be taken to be of finite rank iff M is finitely generated. If G can
be taken to be of finite rank, M is said to be finitely related and the module M is
called finitely presented if F, G can both be taken to be of finite rank. Given two
presentations

\[ 0 \to K_i \to F_i \to M \to 0 \quad (i = 1, 2), \tag{4.6.3} \]

where F_1, F_2 are free (but K_1, K_2 need not be free), we have by Schanuel's lemma
(Lemma 2.4.2), F_1 ⊕ K_2 ≅ F_2 ⊕ K_1, hence if M has a presentation with F_1 of finite
rank and another with K_2 finitely generated, then F_2, K_1 are also finitely generated
and M is finitely presented.
Let us examine the presentation matrix of a projective module.


Proposition 4.6.1. Let M be an R-module with the presentation (4.6.1). Then M is
projective if and only if there exists a mapping α′ : F → G such that αα′α = α. Thus a
matrix A represents a projective module if and only if there is a matrix A′ such that
AA′A = A.

Proof. Assume that M is projective; then (4.6.1) splits and so F ≅ M ⊕ ker β =
M ⊕ im α, so im α is also projective. Hence the projection F → im α can be lifted to
a map α′ : F → G, whose composition with α is the projection onto im α, i.e.
αα′α = α.

Conversely, if α′ satisfies αα′α = α, then αα′ is an idempotent endomorphism of
G, hence G = G_1 ⊕ G_0, where G_1 = im αα′, G_0 = ker αα′, and writing α_1 = α|G_1 we
have the exact sequence (4.6.1) with G, α replaced by G_1, α_1. Since α_1α′ = 1, the
sequence splits and so M is projective. The final assertion follows by rewriting the
result in terms of matrices. ∎


Next we give some conditions for a module to be flat: 


Theorem 4.6.2. For any right R-module U (over any ring R) the following conditions
are equivalent:

(a) Tor_1^R(U, −) = 0,

(b) U is flat,

(c) for any free left R-module F and any submodule G of F, the map U ⊗ G → U ⊗ F
induced by the inclusion G ⊆ F is injective,

(d) given uc = 0, where u ∈ U^n, c ∈ ^nR, there exist A ∈ ^mR^n and v ∈ U^m such that
u = vA and Ac = 0,

(e) for any finitely generated left ideal 𝔞 of R the map U ⊗ 𝔞 → U induced by the
inclusion 𝔞 ⊆ R is injective, so that U ⊗ 𝔞 ≅ U𝔞,

(f) as (e), but for any left ideal.


Condition (d) may be expressed loosely by saying that any relation in U is a conse- 
quence of relations in R; this explains the name ‘flat’, if one thinks of relations in a 
module as a kind of torsion. 


Proof. It is clear from the definition that (a) and (b) are equivalent. When U is flat,
the induced sequence

\[ 0 \to U \otimes G \to U \otimes F \]

is exact, so (b) ⇒ (c). To show that (c) ⇒ (d), consider the exact sequence

\[ 0 \to K \to R^n \xrightarrow{\ \beta\ } R, \]

where β : (x_i) ↦ Σ x_ic_i and K = ker β. By (c) the induced sequence

\[ 0 \to U \otimes K \to U^n \to U \]

is exact; since Σ u_ic_i = 0, we have u_i = Σ_h v_ha_hi for some v_h ∈ U,
(a_h1, ..., a_hn) ∈ K, and so Σ_i a_hic_i = 0, which proves (d).

(d) ⇒ (e). Every element of U ⊗ 𝔞 has the form Σ u_i ⊗ c_i for some u_i ∈ U, c_i ∈ 𝔞. If
Σ u_ic_i = 0, then in vector notation, u = vA, where Ac = 0, hence in U ⊗ 𝔞 we
have u ⊗ c = vA ⊗ c = v ⊗ Ac = 0, and this shows the mapping U ⊗ 𝔞 → U to be
injective.

Now (e) ⇒ (f) is clear since any relation involves only finitely many elements of 𝔞,
and to prove (f) ⇒ (a), we have, for any left ideal 𝔞, an exact sequence

\[ 0 \to U \otimes \mathfrak{a} \to U \to U \otimes (R/\mathfrak{a}); \]

hence Tor_1^R(U, C) = 0 for any cyclic left R-module C. Hence Tor_1^R(U, A) = 0 for
any left R-module A, by induction on the number of generators of A, using the
exact homology sequence, and then taking the limit over finitely generated left
R-modules A. ∎


There is another characterization of flat modules, which clarifies their relation to 
projective modules. 


Proposition 4.6.3. Let U be an R-module (over any ring R) with a presentation

\[ 0 \to K \xrightarrow{\ \alpha\ } F \xrightarrow{\ \beta\ } U \to 0, \tag{4.6.4} \]

where F is free. Then the following conditions are equivalent:

(a) U is flat,
(b) given x ∈ K, there exists γ : F → K such that xαγ = x,
(c) given x_1, ..., x_n ∈ K, there exists γ : F → K such that x_iαγ = x_i for i = 1, ..., n.

Proof. (a) ⇒ (b). Let (f_i) be a basis of F; by suitably renumbering the f's we may
write x = f_1c_1 + ... + f_nc_n. Hence xβ = Σ (f_iβ)c_i = 0, so by Theorem 4.6.2(d) there
exist g_1, ..., g_m ∈ F and a_hi ∈ R such that f_iβ = Σ_h (g_hβ)a_hi, Σ_i a_hic_i = 0. It follows
that f_i′ = f_i − Σ_h g_ha_hi ∈ K and x = Σ f_ic_i = Σ (f_i − Σ_h g_ha_hi)c_i = Σ f_i′c_i. Now
define γ : F → K by

\[ f_i\gamma = \begin{cases} f_i' & \text{for } i = 1, \dots, n, \\ 0 & \text{otherwise.} \end{cases} \]

Then xαγ = (Σ f_ic_i)αγ = Σ f_i′c_i = x, as required.

(b) ⇒ (c). By applying the same argument to the exact sequence of R_n-modules

\[ 0 \to K^n \to F^n \to U^n \to 0, \]

and observing that U^n is flat whenever U is, we obtain (c).

Clearly (c) ⇒ (b) holds trivially, and to prove (b) ⇒ (a) we shall verify (d) of
Theorem 4.6.2. Let (f_i) be a basis of F and consider a relation in U, which by suitable
numbering of the f's may be written Σ (f_iβ)c_i = 0. Then x = Σ f_ic_i ∈ K and by
hypothesis there exists γ : F → K such that xαγ = x. Put f_iγ = f_i + Σ_h f_ha_hi; then
since xαγ = x, we have Σ f_ic_i = Σ (f_iγ)c_i = Σ f_ic_i + Σ f_ha_hic_i and so Σ_i a_hic_i = 0,
because the f_h are free. Moreover, since f_iγ ∈ K, we have f_iβ = −Σ_h (f_hβ)a_hi, so Theorem
4.6.2(d) is satisfied, and U is flat. ∎


If U is finitely related, we can in (4.6.4) take a finite generating set of K and so
obtain a map γ : F → K which splits (4.6.4). Hence U is then projective, and
since any projective module is flat, this proves


Corollary 4.6.4. A finitely related module is flat if and only if it is projective. ∎


This shows that for example, over a Noetherian ring, every finitely generated flat 
module is projective. We can now also characterize rings over which every (left or 
right) module is flat (M. Auslander, 1957): 


Theorem 4.6.5. For any ring R the following conditions are equivalent:

(a) every right R-module is flat,
(a°) every left R-module is flat,
(b) given a ∈ R, there exists a′ ∈ R such that aa′a = a.

Proof. We shall prove the implications (a°) ⇒ (b) ⇒ (a); the theorem then follows
by symmetry.
Let us apply Proposition 4.6.3 to the exact sequence of left R-modules

\[ 0 \to Ra \to R \to R/Ra \to 0. \tag{4.6.5} \]

By hypothesis R/Ra is flat, so there exists a mapping γ : R → Ra such that aγ = a.
Let 1γ = a′a; then a = aγ = (a.1)γ = a.a′a and (b) follows. Conversely, when (b)
holds, the argument just given shows that (4.6.5) satisfies Proposition 4.6.3(c);
more generally, this holds for any left ideal 𝔞 in place of Ra; thus if u ∈ 𝔞 and
uu′u = u, we can define γ : R → 𝔞 by x ↦ xu′u. Hence R/𝔞 is flat and so
Tor_1^R(U, R/𝔞) = 0 for any right R-module U, so the mapping U ⊗ 𝔞 → U induced
by the inclusion 𝔞 ⊆ R is injective. By Theorem 4.6.2(f) it follows that U is flat and
so (a) holds. ∎


A ring satisfying condition (b) of this theorem is called (von Neumann) regular or
sometimes absolutely flat. Clearly every semisimple ring is regular, but of course the
converse does not hold. For example, if K is any field and I a set, then the direct
power K^I is a regular ring, but it is not semisimple, unless I is finite.
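Condition (b) of Theorem 4.6.5 is easy to test by brute force in a finite ring. The sketch below is an illustration only: it assumes R = Z/n, and the helper names quasi_inverse and is_regular are hypothetical. It searches for an element a′ with aa′a = a for each a.

```python
def quasi_inverse(a, n):
    """Return some b in Z/n with a*b*a = a, or None if there is none."""
    for b in range(n):
        if (a * b * a) % n == a % n:
            return b
    return None


def is_regular(n):
    """Check condition (b) of Theorem 4.6.5 for the ring Z/n."""
    return all(quasi_inverse(a, n) is not None for a in range(n))
```

For instance Z/6, a direct product of two fields, passes the test, while Z/4 fails at a = 2, since 2a′2 = 0 for every a′.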

A link between flat and injective modules is provided by 


Proposition 4.6.6. A left R-module V (over any ring R) is flat if and only if
V̂ = Hom_Z(V, K) is injective, as right R-module, where K = Q/Z.

Proof. For any right R-module U we have, by adjoint associativity, the natural
isomorphism

\[ \operatorname{Hom}_R(U, \operatorname{Hom}_{\mathbf{Z}}(V, K)) \cong \operatorname{Hom}_{\mathbf{Z}}(U \otimes V, K). \tag{4.6.6} \]

Now assume that V is flat; then − ⊗ V is an exact functor, hence the right-hand side
of (4.6.6) is exact as a functor of U, hence so is the left and this means that V̂ is
injective (by definition). Conversely, when V̂ is injective, the two sides of (4.6.6) are exact
as functors of U. But the functor Hom_Z(−, K) is faithful, and so preserves inexact
sequences, by Proposition 2.2.6, hence U ⊗ V must be exact in U, so V is flat, as
claimed. ∎


For example, R̂ is always injective; this module has another property that is
sometimes useful. We recall that a cogenerator U is defined dually to the term 'generator'
by the condition that h_U is faithful; explicitly, for U ∈ _RMod, this means that for any
_RM and 0 ≠ x ∈ M there is a homomorphism γ : M → U such that xγ ≠ 0. For
example, K = Q/Z is a cogenerator for Z. For, given any abelian group A and
0 ≠ x ∈ A, let p be a prime factor of the order of x (or any prime if x has infinite
order). Then xZ/pxZ is of order p and can be embedded in K; since K is injective
(i.e. divisible in this case, see Section 2.3), this embedding can be extended to a
homomorphism of A/pxZ into K; combined with the natural mapping
A → A/pxZ this gives a homomorphism A → K which does not kill x. In the general
case we obtain a cogenerator by forming a coinduced extension, as follows.


Theorem 4.6.7. Let R be a ring and f : Z → R the canonical map. Then
R̂ = Hom_Z(R, K) is an injective cogenerator.

Proof. For any right R-module M we have, by adjoint associativity,

\[ \operatorname{Hom}_R(M, \hat{R}) \cong \operatorname{Hom}_{\mathbf{Z}}(M \otimes R, K) \cong \operatorname{Hom}_{\mathbf{Z}}(M, K), \]

and this is non-zero whenever M ≠ 0, because K is a cogenerator for Z. It follows
that there is a non-zero homomorphism from M to R̂. Hence for any x ∈ M,
x ≠ 0, we have a non-zero homomorphism from xR to R̂; since R̂ is injective, by
Proposition 4.6.6, this can be extended to a homomorphism from M to R̂. This
shows R̂ to be a cogenerator for R, and by Proposition 4.6.6 it is injective. ∎


From the definition it is clear that the class of projective modules admits direct
sums, while that of injective modules admits direct products. In the Noetherian
case one can say a little more (E. Matlis, Z. Papp, 1958-59). A module will be
called uniform if it is non-zero and any two non-zero submodules have a non-zero
intersection.


Theorem 4.6.8. Let R be a ring. Then the direct sum of any family of injective left 
R-modules is injective if and only if R is left Noetherian. When this is so, every 
finitely generated injective left R-module is a direct sum of uniform injectives. 


Proof. Suppose that R is left Noetherian, let {E_i} be any family of injective left
R-modules and put E = ⊕_I E_i. We have to show that any homomorphism f : 𝔞 → E
from a left ideal 𝔞 of R into E extends to a homomorphism of R into E (by Theorem
2.3.4). Since R is left Noetherian, 𝔞 is finitely generated, by u_1, ..., u_r say, and the
images u_1f, ..., u_rf have only finitely many non-zero components and so lie in a
submodule E′ = ⊕_{I′} E_i, where I′ is a finite subset of the index set I. Thus f maps 𝔞 into
E′; as a finite direct sum of injective modules, E′ is again injective, so f extends to a
homomorphism of R into E′ and combining this with the inclusion E′ ⊆ E we obtain
the desired extension of f.

Conversely, assume that any direct sum of injective modules is injective and
consider an ascending chain of left ideals in R:

\[ \mathfrak{a}_1 \subseteq \mathfrak{a}_2 \subseteq \cdots. \tag{4.6.7} \]

Write 𝔞 = ∪ 𝔞_n, let I_n be the injective hull of R/𝔞_n, put I = ⊕ I_n, and define f : 𝔞 → I
as xf = Σ xf_n, where f_n : 𝔞 → I_n is the homomorphism induced by the natural
mapping 𝔞 → 𝔞/𝔞_n. If x ∈ 𝔞, we have x ∈ 𝔞_r for some r = r(x) and so xf_n = 0 for all
n ≥ r; hence only finitely many of the xf_n are non-zero and f is well-defined. By
hypothesis I is injective, so there is a homomorphism f′ : R → I extending f. Let
1f′ = c ∈ I; then c ∈ I_1 + ... + I_s for some s and so f′ followed by the projection
on I_n is zero for n > s. Hence the same is true of f, so any x in 𝔞 must lie in 𝔞_{s+1},
i.e. 𝔞 = 𝔞_{s+1}. Thus (4.6.7) breaks off and this shows R to be left Noetherian.

Clearly every indecomposable injective is uniform, so the last part follows in the
Noetherian case. ∎


The corresponding problems for projective and flat modules have been solved by
Stephen Chase [1960]. A ring R is said to be left coherent if every finitely generated
left ideal in R is finitely related. Now Chase proves (i) the direct product of any
family of flat left R-modules is flat iff R is left coherent, and (ii) the direct product
of any family of projective left R-modules is projective iff R is right perfect and left
coherent.

For commutative Noetherian rings we can describe the indecomposable injective 
modules in terms of prime ideals of the ring. We recall from BA, Section 10.8 that a 
prime ideal p of a commutative ring R is meet-irreducible (this is easily verified 
directly), hence E(R/p), the injective hull of R/p, is indecomposable. Further we 
recall that a maximal annihilator of a module M is a prime ideal; the set of all 
prime ideals which form annihilators of elements of M is just Ass(M), and this 
consists of a single element p precisely when 0 is primary in M. 


Theorem 4.6.9. Let R be a commutative Noetherian ring. Then 


(i) for any prime ideal p of R, E(R/p) is an indecomposable injective module, 
(ii) any indecomposable injective module E is isomorphic to E(R/p), for a unique 
prime ideal p. 


Proof. Let p be a prime ideal in R. Then p is meet-irreducible, hence E(R/p) is
indecomposable injective. If E is an indecomposable injective R-module, then 0 is
meet-irreducible in any submodule of E, hence 0 is primary and so Ass(E) consists of a
single element p. Clearly R/p is embedded in E, hence E(R/p) ≅ E, and p is clearly
unique. ∎


Although this result has been extended for certain non-commutative rings, there is
no corresponding description in the general case. However, there is a connexion with
divisible modules which is quite general and which we shall now explain.

We recall that a left R-module M over an integral domain R is said to be divisible
if for any u ∈ M, a ∈ R* the equation u = ax has a solution for x in M. It is easily
verified that over an integral domain every injective module is divisible (see BA,
Proposition 4.7.8). Let us examine more closely how the hypothesis that R is a
domain was used. Given u ∈ M, a ∈ R*, we need to be sure that the mapping
ar ↦ ur from aR to M is well defined, and this will be the case if

\[ ax = 0 \Rightarrow ux = 0 \quad \text{for all } x \in R. \tag{4.6.8} \]

Let R be any ring and M a right R-module such that for any u ∈ M and any a ∈ R,
the equation u = va has a solution v in M whenever (4.6.8) holds. In that case M is
said to be 1-divisible. Over an integral domain this reduces to the notion of a
divisible module and the proof of Proposition 4.7.8 of BA can be adapted to obtain

Proposition 4.6.10. Over any ring R, an injective module is 1-divisible; over a principal 
ideal domain the converse holds too. 


Proof. This will follow from the more general result in Theorem 4.6.11 below. 
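The definition of 1-divisibility can be checked exhaustively for small finite rings. The following Python sketch is an illustration under stated assumptions, not part of the original text: it takes R = Z/n and M = Z/m with m dividing n (so that R is commutative and the side of the action plays no role), and the function name is_one_divisible is hypothetical. For each pair (u, a) satisfying (4.6.8) it looks for v with u = va.

```python
def is_one_divisible(n, m):
    """Test 1-divisibility of M = Z/m over R = Z/n (assumes m divides n)."""
    assert n % m == 0
    for u in range(m):
        for a in range(n):
            # hypothesis (4.6.8): ax = 0 in R implies ux = 0 in M
            hypothesis = all((u * x) % m == 0
                             for x in range(n) if (a * x) % n == 0)
            # conclusion: u = va for some v in M
            solvable = any((v * a) % m == u for v in range(m))
            if hypothesis and not solvable:
                return False
    return True
```

Over R = Z/4 the module Z/2 fails the test at u = 1, a = 2, so by Proposition 4.6.10 it is not injective; the module Z/4 itself passes, as does Z/2 over the semisimple ring Z/6.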


In general, being 1-divisible is of course not sufficient for injectivity, but a
necessary and sufficient condition is now easily obtained. We recall that for any index set
I, M^I denotes the direct product of copies of M indexed by I, while ^IM denotes the
direct sum of I copies. We shall visualize the elements of M^I and ^IM as rows and
columns respectively; thus for any u ∈ M^I, x ∈ ^IR we can form ux = Σ u_ix_i, because
almost all components of x are 0.

A right R-module M is called fully divisible if it satisfies the following
generalization of (4.6.8):

Given any set I, if u ∈ M^I, a ∈ R^I are such that

\[ ax = 0 \Rightarrow ux = 0 \quad \text{for all } x \in {}^I R, \tag{4.6.9} \]

then there exists v ∈ M such that u = va.
Now the characterization of injective modules can be stated as follows:


Theorem 4.6.11. A module M over a ring R is injective if and only if it is fully divisible. 


Proof. The necessity follows easily: given u ∈ M^I, a ∈ R^I satisfying (4.6.9), let 𝔞 be
the right ideal generated by the components of a = (a_i) and define a homomorphism
𝔞 → M by

\[ \sum_i a_ix_i \mapsto \sum_i u_ix_i. \]

By (4.6.9) this is well defined, clearly it is a homomorphism, so it can be extended to
a homomorphism R → M, because M is injective. If 1 ↦ v in this mapping, then
a_i ↦ va_i = u_i, and this shows M to be fully divisible.

Conversely, if M is fully divisible, let 𝔞 be any right ideal of R and (a_i) (i ∈ I) a
generating set of 𝔞. Suppose we have a homomorphism f : 𝔞 → M and let a_if = u_i.
If the x_i ∈ R are such that Σ a_ix_i = 0, then Σ u_ix_i = Σ (a_if)x_i =
(Σ a_ix_i)f = 0, hence (4.6.9) holds and by full divisibility there exists v ∈ M such
that u_i = va_i. Now the map x ↦ vx of R into M extends f, because a_i ↦ va_i = u_i,
hence Σ a_ix_i ↦ Σ u_ix_i for any family (x_i) ∈ ^IR. By Baer's criterion it follows that
M is injective. ∎


Let us call M finitely divisible if for any integer n ≥ 1 and any u ∈ M^n, A ∈ R_n,
satisfying the condition

\[ Ax = 0 \Rightarrow ux = 0 \quad \text{for all } x \in {}^n R, \tag{4.6.10} \]

there exists v ∈ M^n such that u = vA. Essentially this states that M^n is 1-divisible, as
R_n-module, for all n. Then we have


Corollary 4.6.12. A right R-module M over a right Noetherian ring R is injective if and 
only if it is finitely divisible. 


Proof. The necessity is clear; to prove the sufficiency, we observe that by Morita
equivalence we have for all n ≥ 1,

\[ M \cong \operatorname{Hom}_R(R, M) \cong \operatorname{Hom}_{R_n}({}^nR, {}^nM). \]

Let 𝔞 be a right ideal of R, generated by n elements, say. Then 𝔞 corresponds to a
submodule T of ^nR generated by a single element (namely the column with the
generating set of 𝔞 as components). As in the proof of the theorem, any
homomorphism T → ^nM extends to a homomorphism ^nR → ^nM. Thus in the
commutative diagram

\[ \begin{array}{ccccc}
\operatorname{Hom}_R(R, M) & \to & \operatorname{Hom}_R(\mathfrak{a}, M) & \to & 0 \\
\downarrow\cong & & \downarrow\cong & & \\
\operatorname{Hom}_{R_n}({}^nR, {}^nM) & \to & \operatorname{Hom}_{R_n}(T, {}^nM) & \to & 0
\end{array} \]

the bottom row is exact; hence so is the top row. Now the result follows again by
Baer's criterion. ∎


Exercises 


1. Show that a module is flat whenever every finitely generated submodule is
contained in a flat submodule.

2. Show that a flat module over an integral domain is torsion-free, and that the
converse holds over a principal ideal domain.

3. Show that a direct sum of modules is flat iff each term is flat.

4. Use Proposition 4.6.6 to show that M_R is flat iff the canonical map M ⊗ 𝔞 → M
is injective for every finitely generated left ideal 𝔞 of R. Hence obtain another
proof that (b) ⇔ (d) in Theorem 4.6.2.

5. Show that every finitely related module is a direct sum of a free module and a
finitely presented module.




6. Show that every finitely generated projective module is finitely presented.

7. Show that every simple module over a commutative regular ring is injective.

8. Let R be a commutative integral domain with field of fractions K. Show that K is
flat as R-module, but not free, unless K = R. When is K projective?

9. Show that every right Noetherian regular ring is semisimple.

10. Let E be an injective cogenerator and for any module M define I°(M) =
E^{Hom(M, E)}. Show that I°(M) is injective and that M may be embedded in it.

11. Let M be a flat module. Show that if M has a resolution of finite length by finitely
generated projective modules, then M is projective.

12. Show that every finitely generated projective module over Z/p^n is free.

13. Show that every non-zero projective module has a maximal (proper) submodule.

14. Let M be an R-module with a presentation (4.6.3) in which F_1 is free and K_1 is
finitely generated, and another where F_2 is finitely generated (but not necessarily
free). Show that M is finitely presented. Give an example to show that if F_1 is
finitely generated free and K_2 is finitely generated (but F_2 is not free), M need
not be finitely presented.

15. Show that a presentation matrix A represents a projective right R-module iff
there exists a matrix C such that AC = I. Show that A represents a flat module
iff for any finite set of rows of A there exists C such that the corresponding rows
of AC have the form I.


4.7 Hochschild cohomology and separable algebras 


Let K be a commutative ring and A a K-algebra; when we speak of an A-bimodule M,
it will be understood that the two K-actions on M agree, i.e. αm = mα for all m ∈ M,
α ∈ K. It is then clear that an A-bimodule is the same as a right (A° ⊗ A)-module,
where A° is the opposite ring of A. We shall write A^e for A° ⊗ A and call it the
enveloping algebra of A. We shall see in Chapter 5 that for a finite-dimensional
simple algebra the enveloping algebra is a full matrix algebra over the centre.

The free right A^e-module on one free generator is A° ⊗ A, with multiplication rule

\[ (x \otimes y)(a \otimes b) = ax \otimes yb. \]

Similarly A itself is a right A^e-module with multiplication rule x(a ⊗ b) = axb, and
the multiplication mapping μ : A ⊗ A → A defined by (x ⊗ y)μ = xy is an
A^e-module homomorphism. Its kernel is again denoted by Ω, as in Section 2.7, so
that we have an exact sequence

\[ 0 \to \Omega \to A \otimes A \xrightarrow{\ \mu\ } A \to 0. \tag{4.7.1} \]

If this sequence of A^e-modules splits, A is said to be separable over K. We shall soon
see that this is consistent with the use of this term in BA, Section 11.6. We begin by
giving some equivalent conditions for separability:




Proposition 4.7.1. Let K be a commutative ring and A a K-algebra, with enveloping
algebra A^e = A° ⊗ A. Then the following conditions are equivalent:

(a) A is separable,
(b) A is projective as A^e-module,
(c) there exists e ∈ A^e such that eμ = 1 and ae = ea for all a ∈ A.

Proof. (a) ⇔ (b) is clear from the definition of projective module. Now assume that
A is separable and let λ : A → A ⊗ A be a section, i.e. an A^e-module homomorphism
such that λμ = 1_A. The image of 1 ∈ A under λ is of the form e = Σ p_i ⊗ q_i and
since λμ = 1, we have eμ = Σ p_iq_i = 1. Further, for any a ∈ A, ae = a(1λ) =
(a.1)λ = (1.a)λ = (1λ).a = ea, and (c) follows. If (c) holds and e = Σ p_i ⊗ q_i, then
λ : x ↦ xe = ex defines a splitting of (4.7.1) and it follows that A is separable,
i.e. (a). ∎


The element $e$ in (c) is called a separator of $A$, or also separating idempotent; it is in
fact an idempotent (see Exercise 1). It has the following property:


Proposition 4.7.2. Let $e \in A^e$ be such that $e\mu = 1$. Then $e$ is a separator for $A$ if and
only if $(\ker \mu)e = 0$.


Proof. Let $u_\lambda$ be a generating set for $A$ as $K$-module. We claim that

$\ker \mu = \sum (u_\lambda \otimes 1 - 1 \otimes u_\lambda)A^e.$ (4.7.2)

For suppose that $w = \sum u_\lambda \otimes a_\lambda$ belongs to the left-hand side of (4.7.2), i.e.
$w\mu = 0$; then $\sum u_\lambda a_\lambda = 0$ and so $w = \sum (u_\lambda \otimes 1 - 1 \otimes u_\lambda)(1 \otimes a_\lambda)$, which lies in
the right-hand side. The converse follows because $(u_\lambda \otimes 1 - 1 \otimes u_\lambda)\mu = 0$ and $\mu$
is an $A^e$-module homomorphism. Now the conclusion follows because
$(u_\lambda \otimes 1 - 1 \otimes u_\lambda)e = u_\lambda e - e u_\lambda$. ∎
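Condition (c) of Proposition 4.7.1 can be checked by machine in a small case. The following is a minimal sketch, assuming the candidate separator $\sum_i e_{i1} \otimes e_{1i}$ for the $2 \times 2$ matrix ring (the element proposed in Exercise 2 below); the helper names `unit`, `mu`, `normal_form` and `commutes_with_A` are illustrative and not taken from the text. Elements of $A \otimes A$ are stored as lists of pairs $(x, y)$ standing for $\sum x \otimes y$, and compared through their coefficients on the basis $e_{ij} \otimes e_{kl}$.

```python
from itertools import product

N = 2  # A = M_N(Z); integer entries suffice for matrix units

def unit(i, j):
    """Matrix unit e_ij (0-based indices) as a nested tuple."""
    return tuple(tuple(1 if (r, c) == (i, j) else 0 for c in range(N)) for r in range(N))

IDENTITY = tuple(tuple(1 if r == c else 0 for c in range(N)) for r in range(N))

def matmul(x, y):
    return tuple(
        tuple(sum(x[r][t] * y[t][c] for t in range(N)) for c in range(N))
        for r in range(N)
    )

def mu(terms):
    """Multiplication map: sum of x (x) y  |->  sum of xy."""
    total = [[0] * N for _ in range(N)]
    for x, y in terms:
        p = matmul(x, y)
        for r in range(N):
            for c in range(N):
                total[r][c] += p[r][c]
    return tuple(tuple(row) for row in total)

def normal_form(terms):
    """Coefficients of sum x (x) y on the basis e_ij (x) e_kl (zeros dropped)."""
    coeffs = {}
    for x, y in terms:
        for i, j, k, l in product(range(N), repeat=4):
            c = x[i][j] * y[k][l]
            if c:
                coeffs[(i, j, k, l)] = coeffs.get((i, j, k, l), 0) + c
    return {key: c for key, c in coeffs.items() if c}

def commutes_with_A(terms):
    """Check a.e == e.a for every matrix unit a (enough, by linearity)."""
    for i, j in product(range(N), repeat=2):
        a = unit(i, j)
        left = [(matmul(a, x), y) for x, y in terms]    # a(x (x) y) = ax (x) y
        right = [(x, matmul(y, a)) for x, y in terms]   # (x (x) y)a = x (x) ya
        if normal_form(left) != normal_form(right):
            return False
    return True

separator = [(unit(i, 0), unit(0, i)) for i in range(N)]
print(mu(separator) == IDENTITY, commutes_with_A(separator))  # True True
print(commutes_with_A([(IDENTITY, IDENTITY)]))                # False
```

The last line shows that $1 \otimes 1$, which also satisfies $e\mu = 1$, fails the commutation condition, so both parts of (c) are needed.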


The projective dimension of $A$ as $A^e$-module is sometimes called the bidimension,
written bidim $A$; thus bidim $A = 0$ iff $A$ is separable. Let $A$ be a $K$-algebra; any
$A$-bimodule $M$ can be regarded as a right $A^e$-module in the usual way:
$m(a \otimes b) = amb$, or also as left $A^e$-module, by the rule $(a \otimes b)m = bma$. We
define the $n$-th homology group of $M$ as the derived functor of $M \mapsto A \otimes_{A^e} M$, namely

$H_n(A, M) = \mathrm{Tor}_n^{A^e}(A, M),$

where $M$ is regarded as left $A^e$-module. Secondly we define the $n$-th cohomology group
of $M$ as the derived functor of $M \mapsto \mathrm{Hom}_{A^e}(A, M)$:

$H^n(A, M) = \mathrm{Ext}^n_{A^e}(A, M),$

where $M$ is regarded as right $A^e$-module. These groups were first introduced (when $K$
is a field) by Gerhard Hochschild in 1945 and are called the Hochschild groups of $M$
and $A$.




To compute these groups we construct the standard resolution for algebras as
follows:

$\cdots \to S_{n+1} \xrightarrow{d} S_n \to \cdots \to S_1 \xrightarrow{d} S_0 \xrightarrow{\varepsilon} A \to 0,$ (4.7.3)

where $S_n$ is the $(n+2)$-fold tensor product of $A$ with itself over $K$, defined as
$A$-bimodule in the natural way; thus the $A^e$-module structure is given by

$(x_0 \otimes x_1 \otimes \cdots \otimes x_{n+1})(a \otimes b) = ax_0 \otimes x_1 \otimes \cdots \otimes x_{n+1}b.$

The map $\varepsilon$ is the multiplication and the differential $d : S_n \to S_{n-1}$ is given by

$(x_0 \otimes \cdots \otimes x_{n+1})d = \sum_{i=0}^{n} (-1)^i x_0 \otimes \cdots \otimes x_i x_{i+1} \otimes \cdots \otimes x_{n+1}.$

It is clearly an $A^e$-homomorphism, and $d^2 = 0$ can be checked as in Section 3.1,
using the associativity of $A$. To show that (4.7.3) is acyclic, we define a homotopy
$h : S_n \to S_{n+1}$ by $u \mapsto 1 \otimes u$ ($u \in S_n$), together with $\eta : A \to S_0$ given by $x \mapsto 1 \otimes x$.
Then it is easily verified that $dh + hd = 1$, $\eta\varepsilon = 1$, $hd + \varepsilon\eta = 1_{S_0}$.

Now $S_0 = A \otimes A$ is free of rank 1, while $S_n = A \otimes S_{n-2} \otimes A \cong S_{n-2} \otimes A^e$ is
$A^e$-projective whenever $S_{n-2}$ is $K$-projective, e.g. when $K$ is a field. Thus when $A$ is
$K$-projective, (4.7.3) is a projective resolution of $A$ and may be used to compute the
Hochschild groups. We note that $H^0(A, M) = \mathrm{Hom}_{A^e}(A, M) = \{u \in M \mid au = ua$
for all $a \in A\}$; this group is also denoted by $M^A$.

The 1-cocycles are functions $f$ from $A$ to $M$ such that

$f(xy) = xf(y) + f(x)y,$

i.e. derivations, while coboundaries are inner derivations:

$f(x) = xa - ax.$

The 2-cocycles are functions $f : A^2 \to M$ such that

$f(xy, z) + f(x, y)z = f(x, yz) + xf(y, z),$

while a 2-coboundary is of the form

$f(x, y) = xg(y) - g(xy) + g(x)y.$
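The two identities above are easy to test numerically. The sketch below takes $A = M$ to be the ring of $2 \times 2$ integer matrices acting on itself by multiplication, which is an assumption made only for illustration; the function names and the choice of transposition as the auxiliary map $g$ are likewise hypothetical. It checks that an inner derivation satisfies the 1-cocycle rule and that a 2-coboundary satisfies the 2-cocycle rule.

```python
import random

def mul(x, y):
    """Product of two 2x2 integer matrices given as nested tuples."""
    return tuple(
        tuple(sum(x[r][t] * y[t][c] for t in range(2)) for c in range(2))
        for r in range(2)
    )

def add(x, y):
    return tuple(tuple(p + q for p, q in zip(rx, ry)) for rx, ry in zip(x, y))

def sub(x, y):
    return tuple(tuple(p - q for p, q in zip(rx, ry)) for rx, ry in zip(x, y))

def transpose(x):
    return tuple(zip(*x))

def inner_derivation(a):
    """f(x) = xa - ax."""
    return lambda x: sub(mul(x, a), mul(a, x))

def coboundary(g):
    """f(x, y) = x g(y) - g(xy) + g(x) y."""
    return lambda x, y: add(sub(mul(x, g(y)), g(mul(x, y))), mul(g(x), y))

def is_derivation(f, samples):
    """Test f(xy) = x f(y) + f(x) y on all sample pairs."""
    return all(
        f(mul(x, y)) == add(mul(x, f(y)), mul(f(x), y))
        for x in samples for y in samples
    )

def is_2_cocycle(f, samples):
    """Test f(xy, z) + f(x, y) z = f(x, yz) + x f(y, z) on all sample triples."""
    return all(
        add(f(mul(x, y), z), mul(f(x, y), z)) == add(f(x, mul(y, z)), mul(x, f(y, z)))
        for x in samples for y in samples for z in samples
    )

rng = random.Random(0)
samples = [
    tuple(tuple(rng.randint(-3, 3) for _ in range(2)) for _ in range(2))
    for _ in range(4)
]
a = ((1, 2), (3, 4))
print(is_derivation(inner_derivation(a), samples))   # True
print(is_2_cocycle(coboundary(transpose), samples))  # True
```

Both checks hold identically, not just on the samples: expanding either side of each identity gives the same expression, using only associativity of the matrix product.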
To interpret $H^2(A, N)$, let us consider algebra extensions of $A$, where $A$ is
$K$-projective, i.e. short exact sequences

$0 \to N \xrightarrow{\alpha} B \xrightarrow{\beta} A \to 0,$ (4.7.4)

where $B$ is a $K$-algebra with $A$ as quotient and $N$ as kernel. For simplicity we shall
take $N$ to be contained in $B$, so that $\alpha$ is the inclusion mapping. Let $\gamma : A \to B$ be
a $K$-linear mapping left inverse to $\beta$ (which exists since $A$ is $K$-projective). If $\gamma$ is
a homomorphism, then $B$ is the direct sum of $N$ and the subalgebra $A\gamma$ and we
shall say that the sequence (4.7.4) splits. In general the failure of $\gamma$ to be a
homomorphism is measured by

$f(a, b) = (ab)\gamma - (a\gamma)(b\gamma).$ (4.7.5)




Since $\beta$ is an algebra homomorphism inverse to $\gamma$, we have $f(a, b)\beta = ab - ab = 0$,
hence $f(a, b) \in N$. Let us assume that $N^2 = 0$; then $N$, which is a $B$-bimodule, may
be regarded as a $(B/N)$-bimodule, i.e. an $A$-bimodule. In that case $B$ is completely
determined by $A$, $N$ and the function $f : A^2 \to N$, as the $K$-module $A \oplus N$ with
the multiplication

$(a, u)(b, v) = (ab, ub + av + f(a, b)) \quad (a, b \in A,\ u, v \in N).$ (4.7.6)

As one might expect, $f$ satisfies a factor set condition, derived from the associative
law. Write (4.7.5) as $(ab)\gamma = (a\gamma)(b\gamma) + f(a, b)$; then $(ab.c)\gamma = (ab)\gamma.c\gamma +
f(ab, c) = a\gamma.b\gamma.c\gamma + f(a, b)c + f(ab, c)$. Similarly, $(a.bc)\gamma = a\gamma.b\gamma.c\gamma + af(b, c)
+ f(a, bc)$, hence

$f(a, b)c + f(ab, c) = af(b, c) + f(a, bc);$

this shows $f$ to be a cocycle. The extension splits iff there is a section $\gamma'$ for $\beta$ which is
a homomorphism. This means that there is a mapping $g : A \to N$ such that $\gamma'$,
defined by $a\gamma' = (a, -g(a))$, is a homomorphism, i.e. by (4.7.6),

$a\gamma'.b\gamma' = (ab, -g(a)b - ag(b) + f(a, b)) = (ab)\gamma' = (ab, -g(ab)),$

whence

$f(a, b) = ag(b) - g(ab) + g(a)b.$

This is just the condition for $f$ to be a coboundary, so we get


Proposition 4.7.3. Let $A$ be a $K$-algebra which is projective as $K$-module, and let $N$ be
an $A$-bimodule. Then the set of isomorphism classes of algebra extensions of $A$ by $N$ as
ideal with zero multiplication is in natural bijection with $H^2(A, N)$. ∎
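The role of the cocycle condition in (4.7.6) can be seen concretely: the multiplication on $A \oplus N$ is associative exactly when $f$ satisfies it. The following sketch is an illustration under stated assumptions, not part of the text: $A$ and $N$ are both taken to be the $2 \times 2$ integer matrices (with $N$ an $A$-bimodule by matrix multiplication and products within $N$ set to zero by the formula itself), `cocycle` is a coboundary built from transposition, and `not_cocycle` is an arbitrary bilinear map chosen to violate the condition.

```python
def mul(x, y):
    """Product of two 2x2 integer matrices given as nested tuples."""
    return tuple(
        tuple(sum(x[r][t] * y[t][c] for t in range(2)) for c in range(2))
        for r in range(2)
    )

def add(x, y):
    return tuple(tuple(p + q for p, q in zip(rx, ry)) for rx, ry in zip(x, y))

def sub(x, y):
    return tuple(tuple(p - q for p, q in zip(rx, ry)) for rx, ry in zip(x, y))

def transpose(x):
    return tuple(zip(*x))

def make_product(f):
    """Multiplication rule (4.7.6): (a, u)(b, v) = (ab, ub + av + f(a, b))."""
    def multiply(p, q):
        (a, u), (b, v) = p, q
        return (mul(a, b), add(add(mul(u, b), mul(a, v)), f(a, b)))
    return multiply

def is_associative(multiply, elements):
    return all(
        multiply(multiply(p, q), r) == multiply(p, multiply(q, r))
        for p in elements for q in elements for r in elements
    )

def cocycle(a, b):
    """A coboundary a g(b) - g(ab) + g(a) b with g = transpose; hence a cocycle."""
    return add(sub(mul(a, transpose(b)), transpose(mul(a, b))), mul(transpose(a), b))

def not_cocycle(a, b):
    """A bilinear map that does not satisfy the factor set condition."""
    return transpose(mul(a, b))

ZERO = ((0, 0), (0, 0))
ONE = ((1, 0), (0, 1))
E01 = ((0, 1), (0, 0))
elements = [(E01, ZERO), (ONE, ZERO), (((1, 2), (3, 4)), ((0, 1), (1, 1)))]

print(is_associative(make_product(cocycle), elements))      # True
print(is_associative(make_product(not_cocycle), elements))  # False
```

For the second map the triple $(E_{01}, 0), (1, 0), (1, 0)$ already gives different results for the two bracketings, so the failure does not depend on the third sample element.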


In a similar way the first cohomology group describes the module extensions.
Given two right $A$-modules $U$, $V$, we can form $\mathrm{Hom}_K(U, V)$, which inherits the
right $A$-module structure from $V$ and a left $A$-module structure from $U$, and a
verification, as in Section 2.3, shows that we have an $A$-bimodule structure. We now
have the following result:


Theorem 4.7.4. Let $A$ be a $K$-algebra and $U$, $V$ right $A$-modules, where $A$ and $U$ are
projective as $K$-modules. Then

$H^n(A, \mathrm{Hom}_K(U, V)) \cong \mathrm{Ext}^n_A(U, V).$ (4.7.7)


Proof. We start from the situation $({}_A A_A,\ {}_K U_A,\ {}_K V_A)$. As we have just seen, there is a
natural $A$-bimodule structure on $\mathrm{Hom}_K(U, V)$ and as is easily verified, we have a
natural isomorphism (essentially by adjoint associativity)

$\mathrm{Hom}_{A^e}(A, \mathrm{Hom}_K(U, V)) \cong \mathrm{Hom}_A(U \otimes_A A, V) \cong \mathrm{Hom}_A(U, V).$ (4.7.8)

Let $X$ be a free resolution of $A$ as $A^e$-module; then as free $A$-bimodule $X_n$ has the




form $A \otimes_K F_n \otimes_K A$, where $F_n$ is a free $K$-module. Now the tensor product (over $K$)
of free $K$-modules is free, hence the tensor product of projective $K$-modules
is projective, so $X_n = (A \otimes_K F_n) \otimes_K A$ is right $A$-projective and $U \otimes_A X_n =
(U \otimes_K F_n) \otimes_K A$ is right $A$-projective. By (4.7.8) we obtain an isomorphism of
complexes

$\mathrm{Hom}_{A^e}(X, \mathrm{Hom}_K(U, V)) \cong \mathrm{Hom}_A(U \otimes_A X, V).$

The complex on the left has cohomology groups $\mathrm{Ext}^n_{A^e}(A, \mathrm{Hom}_K(U, V)) =
H^n(A, \mathrm{Hom}_K(U, V))$. On the right, since clearly $\mathrm{Tor}_n^A(U, A) = 0$ for $n \geq 1$, $U \otimes_A X$
is a resolution for $U$. As we saw, it is $A$-projective, and so we obtain for its
cohomology group $\mathrm{Ext}^n_A(U, V)$ and (4.7.7) follows. ∎


We observe that the hypotheses of Theorem 4.7.4 on A and U are satisfied when K 
is a field. The results of Theorem 4.7.4 and Proposition 4.7.3 can now be combined 
to establish one of the main theorems on the splitting of algebra extensions: 


Theorem 4.7.5 (Wedderburn's principal theorem). Let $B$ be a finite-dimensional
algebra over a field $k$ and let $I$ be a nilpotent ideal in $B$ such that bidim $B/I \leq 1$.
Then $B$ has a subalgebra $A$ which is a complement of $I$ as $k$-space:

$B = A \oplus I.$ (4.7.9)


Proof. If $I^2 = 0$, this follows from Proposition 4.7.3, because $H^2(B/I, I) = 0$, so we
shall assume that $I^2 \neq 0$ and use induction on the dimension of $B$. The algebra
$B' = B/I^2$ has lower dimension than $B$ and $B'/(I/I^2) \cong B/I$, hence by the induction
hypothesis there exists a subalgebra $C$ of $B$ such that $C \supseteq I^2$ and

$B = C + I, \quad C \cap I = I^2.$

Now $C/I^2 = C/(I \cap C) \cong (C + I)/I = B/I$ satisfies the same hypothesis; since $I$ is
nilpotent, $I^2 \subset I$, hence $C \subset B$ and applying induction again, we find a subalgebra
$A$ of $C$ such that $C = A \oplus I^2$. Now $B = C + I = A + I^2 + I = A + I$ and
$A \cap I = A \cap C \cap I = A \cap I^2 = 0$, therefore (4.7.9) holds. ∎


In particular, taking $I$ to be the Jacobson radical $J(B)$ of $B$, we see that $J(B)$ is
complemented whenever $B/J(B)$ is separable. In that case the result can be proved more
explicitly as follows. Let $e = \sum p_i \otimes q_i$ be a separator for $B/I$. Given the cocycle
$f(a, b)$ arising from a section, we put $g(a) = \sum f(a, p_i)q_i$. Then

$ag(b) - g(ab) + g(a)b = \sum af(b, p_i)q_i - \sum f(ab, p_i)q_i + \sum f(a, p_i)q_i b$
$\qquad = \sum f(a, b)p_i q_i - \sum f(a, bp_i)q_i + \sum f(a, p_i)q_i b.$

The last two terms cancel because $be = eb$ and the first reduces to $f(a, b)$ because
$e\mu = 1$; hence the right-hand side is just $f(a, b)$ and this shows $f$ to be a coboundary.

It remains to determine the separable algebras. In the first place we note that every 
separable algebra over a field is finite-dimensional. This follows from 




Proposition 4.7.6 (Villamayor–Zelinsky, 1966). Let $A$ be a separable $K$-algebra. If $A_K$
is projective, then it is finitely generated as $K$-module.


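Proof. Since $A$ is separable, it is projective as $A^e$-module, hence the same is true of
$A^{\circ}$. Suppose that $A^{\circ} \oplus B = F$, where $F$ is a free $K$-module with a basis $u_\lambda$ ($\lambda \in \Lambda$)
and write $\pi : F \to A^{\circ}$ for the projection. Let $\alpha_\lambda \in F^*$ be the dual basis and put
$\beta_\lambda = \alpha_\lambda | A^{\circ}$; then we have

$x = \sum (\beta_\lambda, x)u_\lambda = \sum (\beta_\lambda, x)\pi(u_\lambda)$ for all $x \in A^{\circ}$. (4.7.10)

The mapping $(x, y) \mapsto (\beta_\lambda, x)y$ from $A^{\circ} \times A$ to $A$ is bilinear, hence there exists
$\varphi_\lambda : A^e \to A$ such that

$(\varphi_\lambda, x \otimes y) = (\beta_\lambda, x)y.$ (4.7.11)

Moreover, for any $z \in A$,

$(\varphi_\lambda, (x \otimes y)z) = (\varphi_\lambda, x \otimes yz) = (\beta_\lambda, x)yz = (\varphi_\lambda, x \otimes y)z.$

Hence by linearity we have

$(\varphi_\lambda, wz) = (\varphi_\lambda, w)z$ for all $w \in A^e$, $z \in A$, (4.7.12)

and for any $w$, $z$ the two sides of (4.7.12) vanish for almost all $\lambda$, because this is true
of the $\beta_\lambda$. Thus $\varphi_\lambda$ is a right $A$-module homomorphism. We claim that there exists a
family $w_\lambda \in A^e$ ($\lambda \in \Lambda$) such that for any $w \in A^e$, $\Lambda(w) = \{\lambda \in \Lambda \mid (\varphi_\lambda, w) \neq 0\}$ is
finite and

$w = \sum w_\lambda(\varphi_\lambda, w).$ (4.7.13)

This is an analogue of the dual basis lemma (BA, Lemma 4.7.5). We define
$w_\lambda = \pi(u_\lambda) \otimes 1_A$; then for any $x \in A^{\circ}$, $y \in A$,

$x \otimes y = \sum (\beta_\lambda, x)\pi(u_\lambda) \otimes y = \sum \pi(u_\lambda) \otimes (\beta_\lambda, x)y = \sum w_\lambda(\varphi_\lambda, x \otimes y),$

by (4.7.11); now (4.7.13) follows by linearity.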
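Now let $e = \sum p_i \otimes q_i$ be a separator for $A$. By definition $e\mu = 1$ and $ex = xe$ for
$x \in A$. Hence for any $y \in A$,

$y = 1.y = (e\mu)y = (ey)\mu = \left[\sum_\lambda w_\lambda(\varphi_\lambda, ey)\right]\mu$ by (4.7.13)
$\quad = \left[\sum_\lambda w_\lambda(\varphi_\lambda, ye)\right]\mu = \left[\sum_\lambda w_\lambda \sum_i (\varphi_\lambda, yp_i \otimes q_i)\right]\mu,$

because $\mu$ is a bimodule homomorphism. Further,

$\left[\sum_\lambda w_\lambda \sum_i (\varphi_\lambda, yp_i \otimes q_i)\right]\mu = \left[\sum_\lambda w_\lambda \sum_i (\beta_\lambda, yp_i)q_i\right]\mu = \sum_{\lambda, i} (w_\lambda\mu)q_i(\beta_\lambda, yp_i).$

Hence $A$ is generated by the family $(w_\lambda\mu)q_i$, where $\lambda$ ranges over the finite set $\Lambda(ey)$,
but $\Lambda(ey) \subseteq \Lambda(e)$, which is finite and independent of $y$. Thus we have found a finite
generating set for $A$. ∎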

For the rest of this section we shall confine ourselves to algebras over a field. Our 
aim will be to show that an algebra over a field k is separable iff it is semisimple and 
remains so under all extensions of k. We begin with some generalities. 


Proposition 4.7.7. (i) If $A$, $B$ are separable algebras over a field $k$, then so are $A \oplus B$
and $A \otimes B$.

(ii) Given a $k$-algebra $A$ and a field extension $F$ of $k$, the algebra $A_F = A \otimes_k F$ is
separable if and only if $A$ is.

(iii) For any field $k$ and any $n \geq 1$, the full matrix ring $k_n$ is separable.


Proof. (i) The separability may be described by the existence of a section $\lambda$ for the
multiplication $\mu$. If $\lambda_A$, $\lambda_B$ are sections for the multiplication in $A$ and $B$ respectively,
then $\lambda_A + \lambda_B : A \oplus B \to (A \otimes A) \oplus (A \otimes B) \oplus (B \otimes A) \oplus (B \otimes B)$ is a section for
the multiplication in $A \oplus B$, while $\lambda_A \otimes \lambda_B : A \otimes B \to A \otimes B \otimes A \otimes B \cong
A \otimes A \otimes B \otimes B$ is a section for $A \otimes B$.

(ii) Let $e = \sum p_i \otimes q_i$ be a separator for $A$; then $e$ is still a separator for $A_F$, for
$e\mu = 1$ and $ae = ea$ continue to hold for $a \in A_F$. Conversely, if $A_F$ is separable,
choose a basis $u_j$ for $F$ over $k$ such that $u_0 = 1$ and write the separator for $A_F$ as

$e = \sum p_{ij} \otimes q_{ij} \otimes u_j$, where $p_{ij}, q_{ij} \in A$.

Then $\sum p_{ij}q_{ij}u_j = 1$, hence equating coefficients of $u_0$ we find that $\sum p_{i0}q_{i0} = 1$.
Further, for any $a \in A$ we have $ae = ea$, i.e. $\sum ap_{ij} \otimes q_{ij} \otimes u_j = \sum p_{ij} \otimes q_{ij}a \otimes u_j$
and on equating coefficients of $u_0$ we find that $\sum ap_{i0} \otimes q_{i0} = \sum p_{i0} \otimes q_{i0}a$, hence
$e_0 = \sum p_{i0} \otimes q_{i0}$ is a separator for $A$, showing $A$ to be separable.

(iii) If $A = k_n$, then $A^e = k_n \otimes k_n \cong k_{n^2}$, hence any $A^e$-module is semisimple, and
so $A$ is separable. ∎


We can now describe separable algebras over a field. 


Theorem 4.7.8. Let $A$ be an algebra over a field $k$. Then $A$ is separable if and only if $A_F$
is semisimple for all extension fields $F$ of $k$. Moreover, when this holds, then $A$ is
finite-dimensional over $k$.


Proof. $A$ is finite-dimensional whenever it is separable or semisimple, by Proposition
4.7.6 and Wedderburn's theorem (BA, Theorem 5.2.4); so we may assume $A$ to be
finite-dimensional in what follows. Assume $A$ separable; then by Theorem 4.7.4 all
module extensions split, i.e. every $A$-module is semisimple, so $A$ is semisimple.




By Proposition 4.7.7 $A_F$ is also separable, so $A_F$ is semisimple for every extension $F$ of
$k$.

Conversely, if $A_F$ is semisimple for all $F \supseteq k$, take $F$ to be an algebraic closure of $k$.
Then $A_F$ is a direct product of full matrix algebras $F_n$. By Proposition 4.7.7, each $F_n$ is
separable and $A_F$ is separable, hence so is $A$. ∎


We note that in this theorem it is enough to require $A_F$ to be semisimple for all
finite field extensions of $k$. The theorem also shows that for finite-dimensional
commutative algebras over a field the notion of separability introduced here reduces
to that of BA, Section 11.6. For a commutative $k$-algebra $A$ is separable in the sense of
this section iff $A_F$ is semisimple for all field extensions $F$ of $k$, and by the form of the
radical in Artinian rings (BA, Theorem 5.3.5) this just means that $A_F$ is a reduced
ring.


Exercises 


1. Verify that a separating idempotent for $A$ is indeed idempotent. (Hint. Use
Proposition 4.7.2 and the fact that $(1 - e)\mu = 0$.)

2. Show that for any commutative ring $K$ and any $n \geq 1$, $K_n$ is separable by
verifying that $\sum e_{i1} \otimes e_{1i}$ (where the $e_{ij}$ are the usual matrix units) is a separating
idempotent.

3. Let $A$ be a separable algebra over a field $k$. Show that any right $A$-module $M$ is a
direct summand of $M \otimes_k A$ and hence is projective. Deduce that $A$ is semisimple.

4. Show that an algebra over a field $k$ is separable iff it is semisimple and its centre
is a direct product of separable field extensions of $k$.

5. Show that a $K$-algebra $A$ (over a commutative ring $K$) is separable iff the functor
$M \mapsto M^A$ for any $A$-bimodule $M$ is exact.

6. Let $k$ be a field of prime characteristic $p$ and $F = k(\alpha)$ a $p$-radical extension of
degree $p$, where $\alpha^p = a \in k$. Let $A$ be the $k$-algebra generated by an element $u$
with the defining relation $(u^p - a)^2 = 0$. Show that $J = J(A)$ is spanned by
$v = u^p - a$ and that $\bar{A} = A/J$ is semisimple, but $A$ is not. Verify that $A$ contains
no subalgebra $\cong \bar{A}$.

7. Show that if $E/k$ is a separable field extension and $A$ is a commutative $k$-algebra,
then $A_E$ is separable over $A$.

8. Let $K$ be a commutative ring and $G$ a finite group whose order $n$ is a unit in $K$.
Show that the group algebra $KG$ is separable, by verifying that $(1/n)\sum g^{-1} \otimes g$
is a separating idempotent.

9. Let $E$ be a commutative separable $K$-algebra. Show that for any $K$-algebra $A$,
gl.dim$(A \otimes E)$ = gl.dim$(A)$. (Hint. Use the separating idempotent of $E$ over $K$
to define an averaging operator as in Exercise 8.)

10. Use Exercise 9 to show that a $K$-algebra $A$ is $K$-projective, i.e. an exact sequence
of $A$-modules which is $K$-split is $A$-split.




Further exercises on Chapter 4 


1. Show that if $M_1 \oplus M_2 \cong N_1 \oplus N_2$ and $M_1 \cong N_1$, then $M_2 \cong N_2$. (Hint. Take the
isomorphism in the form $\alpha = (\alpha_{ij})$, where $\alpha_{ij} : M_i \to N_j$ and use the fact that $\alpha$
and $\alpha_{11}$ are invertible.)

2. If $R$ is a Dedekind domain and $\mathfrak{a}$, $\mathfrak{b}$ are any non-zero ideals in $R$, then
$\mathfrak{a} \oplus \mathfrak{b} \cong R \oplus \mathfrak{a}\mathfrak{b}$ (BA, Theorem 10.6.11). Use this result to show that the
Krull-Schmidt theorem fails over any Dedekind domain which is not a principal
ideal domain.

3. Show that for any projective $R$-module $P$, the intersection of all maximal
submodules is equal to $JP$, where $J = J(R)$. Give an example of a module for
which this fails; can this module be finitely generated?

4. Let $R$ be a ring which can be written as a direct sum of indecomposable left ideals
with simple tops. Show that $R$ is semiperfect. Deduce that a ring is semiperfect iff
every finitely generated module has a projective cover.

5. Let R be a semiperfect ring. Show that any homomorphic image of an indecom- 
posable projective R-module is indecomposable. Does this remain true for more 
general indecomposable modules? 

6. Show that the endomorphism ring of any indecomposable injective module is a 
local ring and deduce a Krull-Schmidt theorem for injective modules. 

7. Show that finitely generated projective modules (over any ring) are isomorphic 
iff they have isomorphic tops. 

8. Let R be any ring and R,, the ring of all row-finite matrices (with countably 
many rows and columns) over R. Show that R < R,, but the R, R,, are not 
Morita equivalent. What is the relation between (i) the ring R,, of all row- 
finite matrices, (ii) the ring ,,R of all column-finite matrices and (ili) R,, 9 R? 

9. A ring R is called basic if R/J(R) is a direct product of a finite number of skew 
fields. Show that the ring of all upper triangular matrices over a local ring is 
basic. 

10. Show that for every semiperfect ring A there exists a basic ring B such that B ~ A 
and that B is unique up to isomorphism. 

11. A functor $T$ is called faithfully exact if it is faithful, exact and preserves
coproducts. An object $P$ in an abelian category $\mathcal{A}$ such that $h^P = \mathcal{A}(P, -)$ is
faithfully exact is called faithfully projective. Show that $P$ is faithfully projective iff $P$ is
a projective generator such that $\mathcal{A}(P, \coprod X_i) = \coprod \mathcal{A}(P, X_i)$, for any family $(X_i)$
of $\mathcal{A}$-objects.

12. Let $P$ be a faithfully projective object in an abelian category $\mathcal{A}$. Write
$A = \mathcal{A}(P, P)$, $hX = \mathcal{A}(P, X)$ for $X \in \mathrm{Ob}\,\mathcal{A}$; verify that $A$ is a ring and $hX$ a
left $A$-module, under composition of maps. Verify also that $h$ is a functor
from $\mathcal{A}$ to ${}_A\mathrm{Mod}$ and use Theorem 4.4.1 to show that this is an equivalence
of functors.

Deduce that any abelian category with coproducts and a faithfully projective 
object is equivalent to a category of modules over a ring (Mitchell—Freyd 
theorem). 

13. A short exact sequence is called pure if it stays exact under tensoring. Show that a 
module M is flat iff every short exact sequence with third term M is pure. 




14. Show that a short exact sequence of $R$-modules with first two terms $M'$, $M$ is
pure ($M'$ is pure in $M$) iff $uA = p$, where $u \in M^m$, $p \in M'^n$, $A \in {}^mR^n$, implies
that $u'A = p$ for some $u' \in M'^m$.

15. Show that for any $R$-module $U$ (over any ring $R$) the correspondence
$U \mapsto \hat{U} = \mathrm{Hom}_{\mathbf{Z}}(U, K)$ is a faithful covariant functor. Show also that there is
a natural transformation $U \mapsto \hat{\hat{U}}$ which is an embedding.

16. Show that a module $M$ is flat iff for every homomorphism $\alpha : P \to M$, where $P$
is finitely generated projective and for every $x \in \ker \alpha$ there is a factorization
$\alpha = \beta\gamma$, where $\beta : P \to Q$, $\gamma : Q \to M$, $Q$ finitely generated projective, such
that $x\beta = 0$.

17. Let $A$ be a $K$-algebra and $M$ an $A$-bimodule. Show that the sequence

$0 \to M^A \to M \to \mathrm{Der}_K(A, M) \to H^1(A, M) \to 0$

is exact, where $\mathrm{Der}_K(A, M) = \mathrm{Hom}_{A^e}(\Omega, M)$ is the module of derivations of
$A$ into $M$.

18. (A. I. Malcev) Let $B$ be an algebra over a field, with a nilpotent ideal $I$ such that
$B/I$ is separable. If $B = A_1 \oplus I = A_2 \oplus I$ are two splittings (Theorem 4.7.5),
show that $A_1$, $A_2$ are conjugate by an automorphism of the form
$x \mapsto (1 - u)^{-1}x(1 - u)$, where $u \in I$. (Hint. If $a \mapsto a_i$ is an isomorphism
$B/I \to A_i$, examine the cochain $f(a) = a_1 - a_2 \in I$.)

19. Show that the injective hull of a PID $R$ (qua left $R$-module) is just the field of
fractions of $R$.


Central simple algebras 


Skew fields are more complicated than fields and much less is known about them. 
However, in the case of division algebras (the case of finite dimension over the 
centre) the situation is rather better. It is convenient to include full matrix rings 
over division algebras, thus our topic in Section 5.1 is essentially the class of 
simple Artinian rings. Although some of our results are proved in this generality 
we shall soon specialize to the finite-dimensional case over a field. 

There is no space to enter into such interesting questions as the discussion of 
division algebras over number fields, but we shall in Section 5.2 introduce an impor- 
tant field invariant, the Brauer group, show its significance for division algebras and 
in Section 5.3 describe some of their invariants. Then, after a look at quaternion 
algebras (Section 5.4), we introduce crossed products (Section 5.5). In Section 5.6 
we study the effect of changing the base field, and in Section 5.7 illustrate them 
on cyclic algebras. 


5.1 Simple Artinian rings 


It is clear from Wedderburn’s theorem that every simple Artinian ring is an algebra 
over a field; it also contains a skew field and is finite-dimensional over the latter, but 
it need not be finite-dimensional as an algebra. We begin by not imposing any finite- 
ness restriction, in fact we shall not even assume our rings to be Artinian, although 
not very much can be said in that generality. 

Let $A$ be any ring and denote its centre by $C$. We may regard $A$ as an $A$-bimodule
or equivalently as a right $A^e$-module, where $A^e = A^{\circ} \otimes A$ as in Section 4.7. The
centralizer of $A^e$ acting on $A$, i.e. of all left multiplications $\lambda_a$ and right
multiplications $\rho_a$ of $A$, is the intersection of the centralizers of all the $\lambda_a$ and $\rho_a$; this is the set
of all left multiplications commuting with all left multiplications, i.e. multiplications
by elements of $C$. Thus the centralizer of the action of $A^e$ on $A$ is just $C$. If further,
$A$ is a simple ring, it is simple as $A^e$-module and by Schur's lemma, $C$ is then a field.

Let $k$ be a field. Given a $k$-algebra $R$ and a right $R$-module $M$, the action of $R$ on $M$
is said to be dense if, given $x_1, \ldots, x_r \in M$ and $\theta \in \mathrm{End}_k(M)$, there exists $a \in R$ such
that $x_i\theta = x_i a$ ($i = 1, \ldots, r$). In this definition we may clearly omit any $x_i$ linearly
dependent on the rest. Hence an equivalent definition is obtained by requiring
that for any $x_1, \ldots, x_r, y_1, \ldots, y_r \in M$, where the $x_i$ are linearly independent over
$k$, there exists $a \in R$ such that $y_i = x_i a$. In particular, when $M$ is finite-dimensional
over $k$, this means that every $k$-linear transformation of $M$ can be accomplished
by acting with some element of $R$. The next result on dense action is basic in the
study of simple algebras.


Theorem 5.1.1 (Density theorem). The centre of any simple ring is a field. If $A$ is a
simple ring with centre $k$, then $A^e = A^{\circ} \otimes A$ is dense in the ring of $k$-linear
transformations of $A$. In particular, when $[A : k] = n$ is finite, then

$A^e = A^{\circ} \otimes A \cong \mathfrak{M}_n(k).$ (5.1.1)


Proof. (Artin–Whaples, 1943) Since $A$ is simple, $k$ is a field. Let $x_1, \ldots, x_r \in A$ be
linearly independent over $k$ and $y_1, \ldots, y_r \in A$; we have to find $\varphi \in A^e$ such that
$y_i = x_i\varphi$. For $r = 1$ this follows by the simplicity of $A$, so let $r > 1$. By induction
there exist $\varphi_j \in A^e$ such that $x_i\varphi_j = \delta_{ij}$ for $i, j = 2, \ldots, r$. If $x_1\varphi_i \notin k$ for some $i$,
then for suitable $b \in A$, $\psi = \varphi_i(b \otimes 1 - 1 \otimes b)$ satisfies $x_1\psi \neq 0$, $x_j\psi = 0$ for $j > 1$,
and for some $\theta \in A^e$, $x_1\psi\theta = 1$, hence $\varphi_1 = \psi\theta$ maps $x_1$ to 1 and the $x_j$ ($j > 1$) to
0. Now $\varphi_i' = \varphi_i - \varphi_1(1 \otimes x_1\varphi_i)$ maps $x_i$ to 1 and the other $x$'s to 0, hence
$\varphi = \sum \varphi_i'(1 \otimes y_i)$ is the required map.

There remains the case where $\lambda_i = x_1\varphi_i \in k$ for $i = 2, \ldots, r$. Put
$\psi = \sum \varphi_i(1 \otimes x_i) - 1$; then $x_j\psi = 0$ for $j = 2, \ldots, r$, and $x_1\psi = \sum \lambda_i x_i - x_1$. By the
linear independence of the $x$'s this is not zero and for a suitable $\theta \in A^e$, $x_1\psi\theta = 1$;
now the proof can be completed as before.

In the finite case we have a surjective homomorphism $A^e \to \mathrm{End}_k(A) \cong k_n$ and
now the isomorphism (5.1.1) follows by counting dimensions, which are $n^2$ on
both sides. ∎


This result will be proved again in a more general context in Section 8.1 below.
That (5.1.1) is an isomorphism also follows from the fact, soon to be proved (in
Corollary 5.1.3) that $A^e$ is simple, for any simple ring with centre $k$.
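The isomorphism (5.1.1) can be tested numerically in a small case. The sketch below assumes the standard fact, not stated in the text, that the quaternions with rational coefficients form a central simple algebra of dimension 4 over $\mathbf{Q}$; the function names are illustrative. It forms the 16 maps $x \mapsto axb$ with $a$, $b$ running over the basis $1, i, j, k$, writes each as a vector of length 16, and computes the rank of the resulting family by Gaussian elimination over the rationals.

```python
from fractions import Fraction

def qmul(p, q):
    """Hamilton product of quaternions given as 4-tuples (1, i, j, k components)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

BASIS = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]

def operator_vector(a, b):
    """The map x -> a x b, flattened: images of the four basis elements in turn."""
    vec = []
    for x in BASIS:
        vec.extend(qmul(qmul(a, x), b))
    return vec

def rank(rows):
    """Rank of a list of rational vectors, by Gauss-Jordan elimination."""
    rows = [[Fraction(v) for v in row] for row in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(len(rows)):
            if r != rk and rows[r][col] != 0:
                factor = rows[r][col] / rows[rk][col]
                rows[r] = [v - factor * p for v, p in zip(rows[r], rows[rk])]
        rk += 1
    return rk

operators = [operator_vector(a, b) for a in BASIS for b in BASIS]
print(rank(operators))  # 16
```

Rank 16 means that the maps $x \mapsto axb$ span the whole 16-dimensional space of linear transformations of the algebra, in agreement with the dimension count $n^2 = 16$ for $n = 4$.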

Let $A$ be a $k$-algebra, where $k$ is a field. Then for any $\alpha \in k$, $a, b \in A$ we have

$\alpha(ab) = (\alpha a)b = a(\alpha b).$

Taking $b = 1$, we see that $\alpha a = a\alpha$; thus the mapping $\alpha \mapsto \alpha.1$ is a homomorphism
of $k$ into the centre of $A$; since $k$ is a field, this is actually an embedding whenever
$A \neq 0$. Conversely, any ring whose centre contains $k$ as a subfield may be considered
as a $k$-algebra in this way. Thus a non-trivial $k$-algebra is essentially a ring whose
centre contains $k$ as a subfield. If the centre of $A$ is precisely $k$, $A$ is called a central
$k$-algebra. Throughout this section $k$ will be a field, fixed but arbitrary. Our
convention will be that an algebra or $k$-algebra need not be finite-dimensional, but a division
algebra is finite-dimensional over the ground field. When the dimension is infinite,
we shall speak of a skew field.

Consider the functor $A \otimes_k -$; we shall show that it preserves the ideal structure
when $A$ is a central simple $k$-algebra.




Theorem 5.1.2 (Azumaya–Nakayama, 1947). Let $A$ be a central simple $k$-algebra.
Then for any $k$-algebra $B$ there is a lattice-isomorphism between the ideals of $B$ and
those of $A \otimes B$. In particular, the ideal lattice of $\mathfrak{M}_n(B)$ is isomorphic to that of $B$,
for any $n \geq 1$.


Proof. Consider the mappings

$\mathfrak{b} \mapsto A \otimes \mathfrak{b}$ ($\mathfrak{b}$ an ideal of $B$) (5.1.2)

$\mathfrak{C} \mapsto \mathfrak{C} \cap B$ ($\mathfrak{C}$ an ideal of $A \otimes B$), (5.1.3)

where $B$ is embedded in $A \otimes B$ by the natural mapping $x \mapsto 1 \otimes x$. Since $k$ is a field,
all these tensor products are exact, so we do have such an embedding. We claim that
the mappings (5.1.2), (5.1.3) are mutually inverse; this will prove the result, for they
establish a bijection which is clearly inclusion-preserving and hence a lattice
isomorphism. The last part then follows because $k_n$ is central simple and $B_n \cong k_n \otimes B$;
of course it is also a consequence of Corollary 4.4.6.

We recall the intersection formula in a tensor product (BA, Section 4.8): If $U$, $V$
are $k$-spaces and $U = U' \oplus U''$, $V = V' \oplus V''$, then

$(U' \otimes V) \cap (U \otimes V') = U' \otimes V'.$

Since $k$ is a field, every subspace is a direct summand. Now if $\mathfrak{b}$ is an ideal in $B$, then
$A \otimes \mathfrak{b}$ is an ideal in $A \otimes B$ and so we have

$(A \otimes \mathfrak{b}) \cap B = \mathfrak{b}.$ (5.1.4)

This holds even for left ideals, hence (5.1.2) is injective for any one-sided ideal $\mathfrak{b}$.

Next let $\mathfrak{C}$ be an ideal in $A \otimes B$ and put $\mathfrak{b} = \mathfrak{C} \cap B$; then $A \otimes \mathfrak{b} \subseteq \mathfrak{C}$ and we have to
establish equality here. Any $c \in \mathfrak{C}$ can be written in terms of a basis $u_i$ of $A$ as
$c = \sum u_i \otimes z_i$, where $z_i \in B$. Only finitely many of the $z_i$ are non-zero, say $z_i \neq 0$
for $i = 1, \ldots, r$. By Theorem 5.1.1, $A^e$ acts densely on $A$, hence there exist $x_j$,
$y_j \in A$ ($j = 1, \ldots, s$) such that $\sum_j x_j u_i y_j = \delta_{i1}$ ($i = 1, \ldots, r$). It follows that
$\sum_j x_j c y_j = \sum_{i,j} x_j u_i y_j \otimes z_i = 1 \otimes z_1 \in \mathfrak{C}$; so $z_1 \in \mathfrak{b}$ and similarly $z_i \in \mathfrak{b}$ for
$i = 2, \ldots, r$. Therefore $\mathfrak{C} = A \otimes \mathfrak{b}$ and this is the desired equality. Thus (5.1.2),
(5.1.3) are mutually inverse and the lattice isomorphism follows. ∎


We observe that no finiteness assumptions are needed here. The theorem has a 
number of important consequences. 


Corollary 5.1.3. If $A$ is a central simple $k$-algebra and $B$ is any $k$-algebra, then $A \otimes B$ is
simple if and only if $B$ is simple; further the centre of $A \otimes B$ is isomorphic to that of $B$.
In particular, the tensor product of central simple $k$-algebras is again central simple.


Proof. The assertion about the centre follows because the centre of a tensor product
is the tensor product of the centres, as is easily verified (see BA, Corollary 5.4.4). The
rest is a consequence of Theorem 5.1.2. ∎




Sometimes we shall want to know when a given division algebra over $k$ can be
embedded in a matrix ring over a skew field $D$ containing $k$ in its centre. There is
a simple answer when the centre of $D$ is a regular extension of $k$; it is given by the
following result, taken from Schofield (1985) (a field extension $F/k$ is regular if
$E \otimes F$ is an integral domain for any field extension $E/k$).


Proposition 5.1.4. Let $D$ be a skew field whose centre $F$ is a regular extension of $k$, and
let $A$ be a simple Artinian $k$-algebra. Then $A^{\circ} \otimes_k D$ is a simple Artinian ring, with a
unique simple module $S$ which is finite-dimensional over $D$, say $[S : D] = s$, and $A$
can be embedded in $\mathfrak{M}_n(D)$ if and only if $s \mid n$.


Proof. Let $C$ denote the centre of $A$. We have

$A^{\circ} \otimes_k D \cong A^{\circ} \otimes_C (C \otimes_k D),$

and $A$ is central simple over $C$; hence by Theorem 5.1.2, the ideals of $A^{\circ} \otimes D$
correspond to those of $C \otimes_k D$. Next we have

$C \otimes_k D \cong (C \otimes_k F) \otimes_F D,$

so the ideals correspond to those of $C \otimes F$. By hypothesis this is an integral domain;
since $[C \otimes_k F : F] = [C : k]$, which is finite, $C \otimes F$ is a field, and it follows that
$A^{\circ} \otimes D$ is simple. It is Artinian because its dimension over $D$ is finite. This also
shows that the unique simple $(A^{\circ} \otimes D)$-module $S$ has finite dimension over $D$.
Suppose now that $A$ is embedded in $D_n$; we may regard $D_n$ as the endomorphism
ring of $D^n$, qua right $D$-module. In this way $D^n$ becomes an $(A, D)$-bimodule, i.e. a
right $(A^{\circ} \otimes D)$-module. As such it is isomorphic to $S^r$ for some $r \geq 1$, and a
comparison of dimensions shows that $n = rs$. Conversely, if $n = rs$, then $D^n \cong S^r$ and
this is an $(A, D)$-bimodule, hence $A$ is embedded in $D_n$. ∎


We note that in this result the regularity assumption can be omitted when A is a 
central k-algebra. 

We shall want to know when a given k-algebra can be written as a tensor product. 
First we shall look at the general case; in the finite-dimensional case we can then 
easily obtain a complete answer. 


Proposition 5.1.5. Let $P$ be a $k$-algebra with a central simple subalgebra $A$, whose
centralizer in $P$ is denoted by $A'$. Then the subalgebra of $P$ generated by $A$ and $A'$ is
their tensor product over $k$.

If further $[A : k]$ is finite, then $P = A \otimes A'$ and the centre of $A'$ is the centre of $P$.


Proof. By hypothesis, $A$ and $A'$ commute elementwise, so the mapping $(x, y) \mapsto
xy$ ($x \in A$, $y \in A'$) gives rise to a homomorphism

$A \otimes A' \to P.$ (5.1.5)

Its kernel is an ideal in $A \otimes A'$, which by Theorem 5.1.2 is of the form $A \otimes \mathfrak{a}$, where $\mathfrak{a}$
is the kernel of the restriction of (5.1.5) to $A'$. But this is the inclusion mapping,
which is injective, so $\mathfrak{a} = 0$ and (5.1.5) is injective. Clearly its image is the subalgebra
generated by $A$ and $A'$.

Suppose now that $[A : k] = n$ is finite. We can regard $P$ as $A^e$-module, i.e. by
Theorem 5.1.1 as $k_n$-module. This module is semisimple, hence $P = \bigoplus P_\lambda$, where
$P_\lambda$ is simple, isomorphic to $A$. Let $u_\lambda \in P_\lambda$ correspond to 1 in this isomorphism;
then $u_\lambda a = au_\lambda$ for all $a \in A$, hence $u_\lambda \in A'$ and so $P = AA'$. Therefore (5.1.5) is
surjective, hence an isomorphism in this case. Now the assertion about the centre
follows by Corollary 5.1.3. ∎


In particular we find the identity

$k_{mn} \cong k_m \otimes k_n,$ (5.1.6)

which is also easily verified directly. We remark that in general $A \otimes A'$ will be a
proper subalgebra of $P$; for example, if $P = k\langle x, y, z\rangle$, the free algebra on $x, y, z$
and $A$ is the subalgebra generated by $x, y$, then $P$ and $A$ both have centre $k$ and
$A' = k$, so $AA' \neq P$.
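The direct verification of (5.1.6) mentioned above amounts to two facts about the Kronecker product of matrices: it is multiplicative, and it carries pairs of matrix units to matrix units bijectively. The sketch below checks both for $m = 2$, $n = 3$ with integer entries; the sizes, the sample matrices and the helper names are assumptions made for illustration only.

```python
def matmul(x, y):
    """Product of two square matrices of equal size, given as nested tuples."""
    n = len(x)
    return tuple(
        tuple(sum(x[r][t] * y[t][c] for t in range(n)) for c in range(n))
        for r in range(n)
    )

def kron(x, y):
    """Kronecker product: entry (i*n + k, j*n + l) equals x[i][j] * y[k][l]."""
    n = len(y)
    size = len(x) * n
    return tuple(
        tuple(x[r // n][c // n] * y[r % n][c % n] for c in range(size))
        for r in range(size)
    )

def unit(size, i, j):
    """Matrix unit e_ij of the given size (0-based indices)."""
    return tuple(
        tuple(1 if (r, c) == (i, j) else 0 for c in range(size))
        for r in range(size)
    )

a = ((1, 2), (3, 4))
c = ((0, 1), (5, -2))
b = ((1, 0, 2), (0, 3, 1), (4, 1, 0))
d = ((2, 1, 0), (1, 0, 1), (0, 5, 3))

# (a (x) b)(c (x) d) = ac (x) bd, so a (x) b -> kron(a, b) respects products.
multiplicative = kron(matmul(a, c), matmul(b, d)) == matmul(kron(a, b), kron(c, d))

# The 4 * 9 = 36 products of matrix units are exactly the 36 matrix units of size 6.
products = {
    kron(unit(2, i, j), unit(3, k, l))
    for i in range(2) for j in range(2) for k in range(3) for l in range(3)
}
all_units = {unit(6, r, s) for r in range(6) for s in range(6)}

print(multiplicative, products == all_units)  # True True
```

Since the matrix units form a basis on each side, the second check shows that the map is a linear bijection, and the first that it is a ring homomorphism.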

We recall from field theory the theorem of the primitive element: A finite separ- 
able extension F/k can be generated by a single element over k (see BA, Theorem 
7.9.2). Of course a noncommutative algebra cannot be generated by a single element, 
but as we shall now see, in many cases two elements suffice: 


Proposition 5.1.6. Let $D$ be a central division algebra over a field $k$ and let $F$ be a
maximal separable subfield. Then there is an element $u$ in $D$ such that $D = FuF$; in
particular, $D$ can be generated by two elements over $k$.


Proof. Let $M(D)$ be the multiplication algebra of $D$, generated by the left and right
multiplications $\lambda_a$ and $\rho_a$ resp. for $a \in D$. Writing $D^{\circ}$ for the opposite ring of $D$,
we have a homomorphism $D^{\circ} \otimes D \to M(D)$ mapping $a \otimes b$ to $\lambda_a\rho_b$; it is surjective
by definition and since $D^{\circ} \otimes D$ is simple, it is an isomorphism. Restricting $a$ and $b$
to $F$, we obtain a faithful action of $F \otimes F$ on $D$. Now $F$ is separable, so (by BA,
Corollary 5.7.4) we have $F \otimes F \cong E_1 \times \cdots \times E_n$, where $n = [F : k]$ and the $E_i$ are
fields (composites of $F$ with itself, hence isomorphic to $F$). Let $e_i$ be the element
of $M(D)$ corresponding to the unit element of $E_i$ and choose $u_i \in D$ such that
$u_i e_i \neq 0$. If we now write $u = \sum u_i e_i$, then the map $\sum a_i \otimes b_i \mapsto \sum u\lambda_{a_i}\rho_{b_i}$ is
injective, because $u$ is not annihilated by any $E_i$. A comparison of dimensions shows that
it is also surjective, and so $D = FuF$. As a separable extension, $F/k$ is generated by a
single element, $c$ say, hence $D$ is generated by $u$ and $c$ over $k$. ∎


We next come to a basic result in the theory of central simple algebras, the 
Skolem—Noether theorem, which asserts that every automorphism of a finite- 
dimensional central simple algebra is inner. It is useful to have a slightly more 
general form of this result: 


Theorem 5.1.7. Let $A$ be a simple Artinian ring with centre $k$ and $B$ any finite-dimensional
simple $k$-algebra. Given any homomorphisms $f_1, f_2$ from $B$ into $A$, there
exists a unit $u$ of $A$ such that

$$bf_2 = u^{-1}(bf_1)u \quad \text{for all } b \in B. \tag{5.1.7}$$


Proof. We regard $A$ as $(A^\circ \otimes_k B)$-module, i.e. as $(A, B)$-bimodule in two ways:

$$(a, x, b)_i = ax(bf_i) \quad \text{where } a, x \in A,\ b \in B,\ i = 1, 2. \tag{5.1.8}$$

By Corollary 5.1.3, $A^\circ \otimes B$ is simple, and it is Artinian, for if $A = D_t$, where $D$ is a
skew field and $t \ge 1$, then $A^\circ \otimes B \cong (D^\circ \otimes B)_t$ and this has finite dimension over
$D^\circ$. If $V$ is the unique simple $(A^\circ \otimes B)$-module, then every finitely generated
$(A^\circ \otimes B)$-module has the form $V^r$ for some $r \ge 0$, hence the $(A^\circ \otimes B)$-module structures
defined on $A$ by (5.1.8) are isomorphic to $V^r$, $V^s$ respectively. By comparing
dimensions over $D^\circ$ (i.e. regarding $A$ as left $D$-module), we see that $r = s$, hence
the structures are isomorphic. This means that there is a bijective linear transformation
$\gamma : A \to A$ such that

$$[ax(bf_1)]\gamma = a(x\gamma)(bf_2) \quad \text{where } a, x \in A,\ b \in B. \tag{5.1.9}$$

Putting $b = 1 = x$ and writing $1\gamma = u$, we find that $a\gamma = au$ for all $a \in A$. Since $\gamma$ is
surjective, there exists $v \in A$ such that $vu = 1$; but $\gamma$ is also injective and
$(uv - 1)\gamma = uvu - u = 0$, hence $uv = 1$ and so $v = u^{-1}$. If we now put $a = x = 1$
in (5.1.9), we obtain $(bf_1)u = u(bf_2)$ and (5.1.7) follows. ∎
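A small numerical instance may help to fix the statement of Theorem 5.1.7. The Python sketch below is an illustration under assumptions not in the text: $A$ is taken to be the ring of $2 \times 2$ rational matrices, $B = \mathbb{Q}(i)$, and the two embeddings are chosen by hand by sending $i$ to two different matrices with square $-1$. The unit `u` was found by solving the linear equations $(if_1)u = u(if_2)$, and the helper names are hypothetical. Since $B$ is spanned by $1$ and $i$, it suffices to check the conjugacy on $i$.

```python
from fractions import Fraction


def mat_mul(x, y):
    """Product of two 2 x 2 matrices."""
    return [[sum(x[i][t] * y[t][j] for t in range(2)) for j in range(2)]
            for i in range(2)]


def inverse_2x2(m):
    """Inverse of an invertible 2 x 2 matrix, computed exactly with fractions."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]


f1_i = [[0, -1], [1, 0]]   # image of i under the first embedding
f2_i = [[1, -2], [1, -1]]  # image of i under the second embedding
u = [[1, 0], [-1, 2]]      # a solution of f1_i * u = u * f2_i with det u = 2

minus_identity = [[-1, 0], [0, -1]]
both_are_embeddings = (mat_mul(f1_i, f1_i) == minus_identity
                       and mat_mul(f2_i, f2_i) == minus_identity)

# The relation b f_2 = u^{-1} (b f_1) u of (5.1.7), checked for b = i.
conjugated = mat_mul(mat_mul(inverse_2x2(u), f1_i), u)
```

Here `conjugated` equals `f2_i`, so the two copies of $\mathbb{Q}(i)$ inside the matrix ring are conjugate by the unit $u$.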


Two subalgebras or elements are said to be conjugate if there is an inner auto- 
morphism mapping one to the other. From Theorem 5.1.7 we immediately find 


Corollary 5.1.8. In any simple Artinian ring with centre $k$, isomorphic finite-dimensional
simple $k$-subalgebras are conjugate and hence have conjugate centralizers. ∎


In particular this shows that every automorphism of a central simple finite- 
dimensional algebra is inner (Skolem—Noether theorem). More generally we have 


Corollary 5.1.9. Every automorphism of a finite-dimensional semisimple k-algebra 
which leaves the centre elementwise fixed, is inner. 


Proof. Let $\theta$ be the automorphism, write $A = A_1 \oplus \dots \oplus A_r$, where each $A_i$ is simple
and denote the unit-element of $A_i$ by $e_i$. Then $e_i$ is in the centre of $A$, and so $e_i\theta = e_i$,
by hypothesis. It follows that $\theta$ maps each $A_i$ into itself: if $a \in A_i$, then
$a\theta = (ae_i)\theta = a\theta.e_i$, therefore $a\theta \in A_i$. Thus $\theta$ induces an endomorphism of $A_i$, likewise
$\theta^{-1}$, so $\theta$ in fact defines an automorphism of $A_i$, $\theta_i$ say. Now the centre of $A_i$, $C_i$
say, is left fixed by $\theta_i$, so we may regard $\theta_i$ as an automorphism of the central simple
$C_i$-algebra $A_i$. By Corollary 5.1.8 there exists a unit $u_i$ in $A_i$ such that $x\theta_i = u_i^{-1}xu_i$
($x \in A_i$). Now it is easily checked that $u = \sum u_i$ is a unit of $A$ inducing $\theta$. ∎


We remark that Corollary 5.1.8 as it stands does not extend to the case of semi- 
simple subalgebras (see Exercise 5). 




We next look at the relation between the dimension of a simple subalgebra and 
that of its centralizer. 


Theorem 5.1.10 (R. Brauer, 1932). Let $A$ be a simple Artinian ring with centre $k$ and
$B$ a finite-dimensional simple subalgebra with centre $F$. Then the centralizer $B'$ of $B$ in $A$
is again simple with centre $F$ and the centralizer $B''$ of $B'$ equals $B$, while the centralizer
$F'$ of $F$ is given by

$$F' = B \otimes_F B'. \tag{5.1.10}$$

Moreover,

$$[A : B'] = [B : k], \tag{5.1.11}$$

and if $[B : k] = r$, then

$$A \otimes B^\circ \cong B' \otimes k_r \cong B'_r. \tag{5.1.12}$$


Proof. We may regard $k_r$ as acting on $B \cong k^r$ by $k$-linear transformations. As such it
contains the subalgebras $\rho_B$ of right multiplications and $\lambda_B$ of left multiplications.
Clearly $\rho_B \cong B$, $\lambda_B \cong B^\circ$ and $A \otimes k_r$ is central simple, by Corollary 5.1.3. We have
the isomorphic simple subalgebras $B \otimes k$, $k \otimes \rho_B$; they are conjugate by Corollary
5.1.8 and so have isomorphic centralizers:

$$B' \otimes k_r \cong A \otimes \lambda_B \cong A \otimes B^\circ.$$

Since $B' \otimes k_r \cong B'_r$, this proves (5.1.12), and comparing dimensions over $B'$, we find
that $r^2 = [A : B'][B : k]$, from which (5.1.11) follows on dividing by $r$.

Now $A \otimes B^\circ$ is simple, hence by (5.1.12), so is $B'$ (using Corollary 5.1.3 twice).
Clearly $B'' \supseteq B$, $B''' = B'$, and replacing $B$ by $B''$ in (5.1.11), we find that
$[B'' : k] = r$, hence $B'' = B$.

Finally, if the centre of $B'$ is $E$, then $E \supseteq F$ and since $B'' = B$, we also have $F \supseteq E$,
hence $E = F$. Thus $B$, $B'$ are central simple $F$-algebras, both subalgebras of $F'$, and
now (5.1.10) follows from Proposition 5.1.5. ∎


We observe that $B \otimes B'$ is in general distinct from $A$, for the two sides have centres
$F$ and $k$ respectively. Only when $F = k$ do we have $F' = A$ and the above result then
reduces to part of Proposition 5.1.5.


Corollary 5.1.11. Let $A$ be a simple Artinian ring with centre $k$ and let $F$ be a subfield of
$A$ such that $F \supseteq k$ and $[F : k] = r$ is finite. Then

$$A \otimes_k F \cong F' \otimes_k k_r.$$

If moreover, $A$ has finite dimension $n$ over $k$, then $r^2 \mid n$ and writing $B = F'$, we have
$A \otimes_k B^\circ \cong F_{n/r}$.


Proof. This is just the case $B = F$ of Theorem 5.1.10; here $[F' : F] = n/r^2$. ∎




Corollary 5.1.12. Let $A$ be a finite-dimensional central simple $k$-algebra and $F$ a subfield
of $A$ containing $k$. Then the following are equivalent:

(a) $F' = F$,
(b) $F$ is a maximal commutative subring of $A$,
(c) $[A : k] = [F : k]^2$,
(d) $[A : k] = [A : F]^2$.

In particular, every maximal commutative subfield $F$ of a central division algebra $D$
satisfies $[D : k] = [F : k]^2 = [D : F]^2$.


Proof. Clearly $F$ is a maximal commutative subring of $A$ iff $F' = F$ and $F$ is a subfield
by hypothesis. Now for any subfield $F$, $F' \supseteq F$ and $[A : k] = [A : F][F : k] =
[F : k][F' : k]$, hence $[A : F]^2 \ge [A : k] \ge [F : k]^2$, with equality in either place iff
$F' = F$. ∎


We remark that a central simple algebra $A$ may have no subfield $F$ satisfying the
conditions of Corollary 5.1.12, e.g. $A = k_n$, where $k$ is algebraically closed and
$n > 1$. Nevertheless, the dimension of a central simple algebra is always a perfect
square, for by Wedderburn's theorem, $A \cong D_n$, where $D$ is a skew field, again with
centre $k$. If $F$ is a maximal subfield of $D$, then by Corollary 5.1.12 applied to $D$ we
find that $[D : k] = [F : k]^2$, hence $[A : k] = n^2[F : k]^2$. In the next section we shall
meet another proof of this important fact.

As another application of Theorem 5.1.10 we have Wedderburn’s theorem on 
finite fields. This was proved in BA, Theorem 7.8.6; below is another proof. We 
shall need the following remark about finite groups. 


Lemma 5.1.13. Let G be a finite group and H a proper subgroup. Then G cannot be 
written as the union of all the conjugates of H. 


Proof. Let $|H| = h$, $(G : H) = n$, so that $|G| = hn$. If $a_1, \dots, a_n$ is a right transversal
for $H$ in $G$, then each conjugate of $H$ has the form $a_i^{-1}Ha_i$. There are at most $n$ such
conjugates and each contains $h$ elements, but the unit element is common to all
of them, so their union contains at most $(h - 1)n + 1$ elements in all. Since
$n > 1$, this is less than $hn = |G|$, so not every element of $G$ is included. ∎
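The counting argument can be watched at work in a small case. The Python sketch below is an illustration under assumptions chosen here: $G$ is the symmetric group on four letters, $H$ is the cyclic subgroup generated by a 4-cycle, permutations are stored as tuples of images and composed in the order "first $p$, then $q$", and the helper names are hypothetical.

```python
from itertools import permutations


def compose(p, q):
    """The permutation 'first p, then q' (maps written on the right)."""
    return tuple(q[p[i]] for i in range(len(p)))


def inverse(p):
    """Inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def generated_subgroup(gens, n):
    """Subgroup of S_n generated by gens (closure under right multiplication)."""
    identity = tuple(range(n))
    elems = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(x, g)
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return elems


def union_of_conjugates(G, H):
    """Union of all conjugates a^{-1} H a, a in G."""
    return {compose(compose(inverse(a), h), a) for a in G for h in H}


G = set(permutations(range(4)))            # S_4, of order 24
H = generated_subgroup([(1, 2, 3, 0)], 4)  # cyclic subgroup of order 4

h = len(H)
n = len(G) // h                            # the index (G : H)
union = union_of_conjugates(G, H)
bound = (h - 1) * n + 1                    # the bound from the proof
```

In this case the union consists of the identity, the six 4-cycles and the three double transpositions, 10 elements in all, within the bound 19 and well short of $|G| = 24$.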


Theorem 5.1.14 (Wedderburn’s theorem on finite fields). Any finite skew field is 
commutative. 


Proof. Suppose that $D$ is a finite skew field; let $k$ denote its centre and let $F$ be a
maximal subfield. Then $F$ is a finite field; all maximal subfields of $D$ have the
same degree $r$, say, over $k$ (Corollary 5.1.12) and hence are isomorphic, as minimal
splitting fields of $x^{q^r} - x$, where $q = |k|$. By Corollary 5.1.8 they are conjugate to $F$.
Now each element of $D$ lies in some maximal commutative subfield of $D$, so $D$ is the
union of conjugates of $F$. It follows that the multiplicative group $D^\times$ is a finite group,
equal to the union of the conjugates of $F^\times$. If $F \ne D$, this is impossible by Lemma 5.1.13;
hence $D = F$ and $D$ must be commutative. ∎




Exercises 


1. Show that every finite-dimensional central simple algebra over a finite field $F$ has
the form $\mathfrak{M}_n(F)$, for some $n \ge 1$.

2. Let $A$ be a finite-dimensional $k$-algebra. Show that if $A_E \cong E_n$, for some extension
field $E/k$ and some $n \ge 1$, then $A$ is central simple over $k$.

3. Let $D$ be a skew field with centre $k$ and let $E$ be a finite-dimensional subalgebra
(necessarily a skew field). Show that if $E'$ is the centralizer of $E$, then the
centralizer of $E'$ is $E$ and $[D : E'] = [E : k]$.

4. Let $C$ be a finite-dimensional $k$-algebra and $A$ a central simple subalgebra. Show
that $[A : k]$ divides $[C : k]$.

5. Show that Corollary 5.1.8 no longer holds for semisimple subalgebras. (Hint.
Take appropriate diagonal subalgebras isomorphic to $k^2$ of $k_3$.)

6. Let $R$, $S$ be $k$-algebras, where $k$ is a field, and let ${}_RU$, $V_S$ be modules as indicated,
where $R$ acts densely on $U$ with centralizer $k$. Show that for $0 \ne u_0 \in U$ the mapping
$v \mapsto u_0 \otimes v$ embeds $V$ in $U \otimes_k V$. Construct a lattice-isomorphism between
$S$-submodules of $V$ and $(R, S)$-subbimodules of $U \otimes V$.

7. Let $D$ be a skew field with centre $k$ and let $F$ be a maximal subfield of $D$. Show
that $D_F$ is a dense ring of linear transformations on $D$ as $F$-space. Show that if
either $[D : F]$ or $[F : k]$ is finite, then so is the other and $D_F$ is then a full matrix
ring over $F$. (Hint. Use the regular representation of $D$ to get a homomorphism
$D \otimes_k F \to F_n$, where $n$ is the dimension of $D$ as right $F$-space.)

8. Let $\delta$ be a derivation of a central simple algebra $A$. By representing $\delta$ as an
isomorphism of (triangular) subalgebras of $\mathfrak{M}_2(A)$ show that $\delta$ is an inner
derivation.

9. (Wedderburn, 1921) Let $D$ be a skew field with centre $k$. Show that any two
elements of $D$ with the same minimal equation over $k$ are conjugate.

10. (A. Kupferoth) Let $D$ be a skew field with centre $C$ and $K$ a subfield with centre
$F$. Show that $[K : F] \le [D : C]$ and when both are finite and equal then $F \subseteq C$
and $D \cong K \otimes_F C$.

11. Show that if $A$ is a finite-dimensional central simple $k$-algebra and $B$ is any
$k$-algebra, then $A \otimes B$ is semisimple iff $B$ is.

12. Let $D$ be a skew field with centre $k$ and $E$ a skew subfield with centre $C$. Show
that $E$ and the subfield generated by $C$ and $k$ are linearly disjoint over $C$.


5.2 The Brauer group 


In this section all algebras will be finite-dimensional over the ground field k. 


Let $A$ be a central simple $k$-algebra. We know that $A$ is a full matrix ring over a
skew field:

$$A \cong D_m \cong D \otimes k_m.$$


Here m is unique and D is a central division algebra over k, unique up to k- 
isomorphism. We shall call D the skew field component of A. Two central simple 
k-algebras A, B are said to be similar, A ~ B, if their skew field components are 




isomorphic. Clearly this is an equivalence relation; we shall denote the class of A by 
(A). We now show that tensor multiplication induces a multiplication of these 
equivalence classes. 

If $A$, $B$ are central simple $k$-algebras, then so is $A \otimes B$, by Corollary 5.1.3. Moreover,
if $A \sim A'$, $B \sim B'$, say $A \cong C \otimes k_m$, $B \cong D \otimes k_n$, where $C$, $D$ are skew fields,
then $A \otimes B \cong (C \otimes k_m) \otimes (D \otimes k_n) \cong C \otimes D \otimes k_{mn}$, hence $A \otimes B \sim C \otimes D$, and
similarly $A' \otimes B' \sim C \otimes D$, whence $A \otimes B \sim A' \otimes B'$. The multiplication of similarity
classes is associative and commutative, by the corresponding laws for tensor
products. Moreover, for any $A$, we have $A \otimes k \cong A$ and $A \otimes A^\circ \cong k_n$ for some
$n \ge 1$, hence

$$(A)(k) = (A), \qquad (A)(A^\circ) = (k).$$

This shows that the class $(k)$ is the neutral element for multiplication and $(A^\circ)$ is the
inverse of $(A)$, and it proves


Theorem 5.2.1. For any field k, the similarity classes of finite-dimensional central 
simple k-algebras form an abelian group with respect to the multiplication induced by 
the tensor product. 

We still need to check that the collection of all classes is actually a set; this is easily
seen if we observe that the central simple algebras are finite-dimensional over $k$. ∎


The group so obtained is called the Brauer group of $k$ and is written $B_k$; its
elements are the Brauer classes of central simple $k$-algebras. The Brauer group is
an invariant of the field $k$ which provides information about the central division
algebras over $k$. Later, in Section 5.5, we shall meet another description of $B_k$, as a
cohomology group.

As an example take an algebraically closed field $F$. If $D$ is a division algebra over $F$,
then for any $a \in D$, $F(a)$ is a finite extension field, hence $F(a) = F$ because $F$ is
algebraically closed, and so $a \in F$, i.e. $D = F$. Thus there are no division algebras
over $F$ apart from $F$ itself, and we conclude that $B_F = 1$, i.e. the Brauer group of
an algebraically closed field is trivial. Of course once we drop the restriction on
the dimension, we can find skew fields with centre $F$, see Section 7.3 and Section 9.1.

For a closer study of $B_k$ we need to examine the behaviour of algebras under
ground field extension. Let $A$ be a (finite-dimensional) $k$-algebra and $F$ an extension
field of $k$ (not necessarily finite-dimensional over $k$). Then the $F$-algebra defined by

$$A_F = A \otimes_k F$$

is again finite-dimensional; in fact any $k$-basis of $A$ will be an $F$-basis of $A_F$, so that we
have

$$[A_F : F] = [A : k]. \tag{5.2.1}$$

We note that $x \mapsto x \otimes 1$ defines an embedding of $A$ in $A_F$. If $A$ has centre $k$, then $A_F$
has centre $F$, by Corollary 5.1.3. From this fact we can deduce




Proposition 5.2.2. If $A$ is a finite-dimensional central simple $k$-algebra, then
$[A : k] = r^2$ for some integer $r$, and there is a finite extension $F$ of $k$ such that
$A_F \cong F_r$; hence $A$ can be embedded in $F_r$.


Proof. If $E$ is an algebraic closure of $k$, then $A_E$ is a full matrix algebra over $E$, say
$A_E \cong E_r$. Comparing dimensions and remembering (5.2.1), we find that $[A : k] =
[A_E : E] = [E_r : E] = r^2$. Now there is an embedding

$$A \to A_E \cong E_r, \tag{5.2.2}$$

by the above remark. It remains to show that we can replace $E$ by a finite extension.
Let $u_1, \dots, u_n$ be a basis of $A$ and $U_1, \dots, U_n$ the matrices over $E$ which correspond
to the $u$'s under the mapping (5.2.2). Then we can express the matrix units in $E_r$ as
$e_{ij} = \sum_\nu \alpha_{ij\nu}U_\nu$ for some $\alpha_{ij\nu} \in E$. Denote the finite set of entries of the $U_\nu$ and the
$\alpha_{ij\nu}$ by $X$. Since $E$ is algebraic over $k$, $X$ generates a finite extension $F$ of $k$ and it is
clear that $A_F \cong F_r$. Since $A$ is embedded in $A_F$, it can also be embedded in $F_r$. ∎


We note that this result could not be deduced from Proposition 5.1.4, because 
here F/k is never regular. Proposition 5.2.2 provides no explicit bound on [F: k], 
but we shall soon meet such a bound, in Corollary 5.2.7. 

For any central simple $k$-algebra $A$ the integer $\sqrt{[A : k]}$ is called the degree of $A$,
written $\operatorname{Deg} A$. If $A \cong D \otimes k_m$, then $\operatorname{Deg} A = m(\operatorname{Deg} D)$; clearly the degree of $D$ is
also an invariant of $A$, called the Schur index, or simply the index of $A$. It is a measure
of how far $A$ deviates from being a full matrix algebra over $k$. We note that the index
is an invariant of the Brauer class of $A$, while the degree determines $A$ up to
isomorphism within its Brauer class.

If $F$ is an extension field of $k$, then the mapping $A \mapsto A_F$ induces a mapping of
Brauer classes, for if $A \cong D \otimes k_m$, then $A_F \cong D \otimes k_m \otimes F \cong D \otimes F_m$, so that
$(A_F) = (D_F)$. The mapping is a homomorphism, for $(A \otimes_k B)_F = A \otimes_k B \otimes_k F \cong
A_F \otimes_F B_F$, so we have a group homomorphism

$$B_k \to B_F.$$

Its kernel $B(F/k)$, the relative Brauer group, consists of those classes $(A)$ over $k$ for
which $A_F \cong F_m$ for some $m$. Such a class is said to be split by $F$ and $F$ is called a
splitting field for this class, or for the algebra $A$. If $A_F = A \otimes F \cong F_m$, then on
taking anti-isomorphisms, we find that $A^\circ \otimes F \cong F_m$, hence $F$ splits $A$ iff it splits
$A^\circ$. Any central simple $k$-algebra has a splitting field, which may be taken of finite
degree over $k$, by Proposition 5.2.2.

Next we examine more closely which fields split a given Brauer class, but first we 
establish a relation between the indices of the extensions. 


Proposition 5.2.3 (Index reduction lemma, A. A. Albert). Let $F/k$ be a finite field
extension, say $[F : k] = r$, and $A$ a central simple $k$-algebra. If $A$, $A_F$ have indices $m$,
$\mu$, then $\mu \mid m$ and $m \mid \mu r$.




Proof. We may take $A = D$ to be a division algebra, without loss of generality. If the
skew field component of $D_F$ is denoted by $C$, then

$$D \otimes F = D_F \cong C \otimes_F F_q; \tag{5.2.3}$$

comparing dimensions over $k$, we find $m^2r = \mu^2q^2r$, hence $m = \mu q$ and so $\mu \mid m$.

Secondly the regular representation of $F$ (by right multiplication) defines an
embedding of $F$ in $k_r$, because $F \cong k^r$ as $k$-space. Thus $F$ is embedded in $k_r$ and
hence $D \otimes F$ is embedded in $D \otimes k_r$. By (5.2.3) we obtain an embedding of $F_q$
in $D \otimes k_r$, but $F \supseteq k$, so $D \otimes k_r$ contains $k_q$ as central simple subalgebra. If the
centralizer of $k_q$ in $D \otimes k_r$ is denoted by $B$, then by Proposition 5.1.5,

$$D \otimes k_r \cong k_q \otimes B \cong k_q \otimes G \otimes k_s,$$

where $G$ is the skew field component of $B$ (in fact $G \cong D$ by uniqueness). Comparing
dimensions, we find that $r = qs$, thus $m/\mu = q \mid r$, i.e. $m \mid \mu r$. ∎


The factor $q$ in $m = \mu q$ is called the index reduction factor; we note that $q = r$
whenever $F$ is isomorphic to a subfield of $D$.
The next corollaries are immediate consequences of Proposition 5.2.3. 


Corollary 5.2.4. Let $A$ be a central simple $k$-algebra and $F/k$ a field extension. If $[F : k]$
is prime to the index of $A$, then $A$ and $A_F$ have the same index. In particular, if $A$ is a
division algebra of degree prime to $[F : k]$, then $A_F$ is again a division algebra. ∎


Corollary 5.2.5. The degree of a splitting field (over $k$) of a central simple $k$-algebra $A$ is
divisible by the index of $A$. ∎
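The two divisibility conditions of Proposition 5.2.3 are easy to tabulate, and both corollaries can be read off from the table. The Python sketch below is an illustration only; the function name is hypothetical, and it lists the values of the index after extension that the conditions allow, without claiming that each value actually occurs for some algebra.

```python
def possible_indices_after_extension(m, r):
    """All mu with mu | m and m | mu * r, where m is the index of A and r = [F : k]."""
    return [mu for mu in range(1, m + 1) if m % mu == 0 and (mu * r) % m == 0]


# Corollary 5.2.4: if r is prime to m, the only possibility is mu = m.
coprime_case = possible_indices_after_extension(3, 4)

# Corollary 5.2.5: mu = 1 (F splits A) is possible only when m divides r.
splitting_possible = 1 in possible_indices_after_extension(4, 8)
splitting_impossible = 1 not in possible_indices_after_extension(4, 6)

# A case where the index may drop without F splitting A.
intermediate_case = possible_indices_after_extension(4, 2)
```

For index 4 and an extension of degree 2 the table gives the values 2 and 4: the index drops by at most the factor 2, in line with $q \mid r$.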


We now obtain a criterion for a given finite extension field of k to split a given 
Brauer class over k. 


Theorem 5.2.6. Let $w \in B_k$ and let $F$ be an extension field of $k$, of degree $r$. Then $F$ splits
$w$ if and only if some algebra in $w$ contains $F$ as a maximal subfield; this algebra necessarily
has degree $r$.


Proof. Let $D$ be the division algebra in the class $w$ and let $n$ be the least integer such
that $F$ can be embedded in $D_n$. Then the centralizer $F'$ of $F$ in $D_n$ is a skew field, by
the minimality of $n$, and $F$ is the centre of $F'$. By Corollary 5.1.11,

$$D_F \sim D_n \otimes F \cong F' \otimes k_r. \tag{5.2.4}$$

Since $F'$ has centre $F$, this shows that $F$ splits $D$ iff $F' = F$, i.e. iff $F$ is a maximal subfield
of $D_n$. ∎


Corollary 5.2.7. Let $D$ be a central division algebra over $k$. Then any maximal subfield
of $D$ is a splitting field for $D$.


Proof. By Corollary 5.1.12 any maximal subfield $F$ satisfies $[D : k] = [F : k]^2$, so the
theorem may be applied. ∎




The next step is to show that the splitting field of a central simple algebra can 
always be taken to be a separable extension of the ground field. We prove more 
than this, namely that we can actually find a separable splitting field in the skew 
field component. More precisely, the proof below shows that every algebraic skew 
field extension contains a separable element. 


Theorem 5.2.8 (Köthe, 1932). Every central division algebra $D$ over $k$ contains a
maximal commutative subfield (hence splitting $D$) which is separable over $k$.


Proof. (Herstein) Clearly we may assume that char $k = p \ne 0$. Our first task will be
to find a separable extension $F$ of $k$ in $D$. Since $[D : k]$ is finite, each element of $D$ is
algebraic over $k$; if some $a \notin k$ is separable over $k$, then $k(a)$ is the required extension.
Otherwise there are no separable extensions of $k$, so each element of $D$ is $p$-radical
over $k$, say $x^{p^r} \in k$ for some $r = r(x)$. Hence we can find $a \notin k$ such that $a^p \in k$.
Denote by $\delta$ the inner derivation induced by $a$, $\delta : x \mapsto xa - ax$. Then $x\delta^p =
xa^p - a^px = 0$, but of course $\delta \ne 0$, because $a \notin k$. Choose $b \in D$ such that
$c = b\delta \ne 0$, $b\delta^2 = 0$. Then $c\delta = 0 = a\delta$, so if $u = bc^{-1}a$, then $u\delta = cc^{-1}a = a$.
Writing this out, we have $ua - au = a$, i.e. $u = 1 + aua^{-1}$, but $u^q \in k$ for some
$q = p^s$, hence $u^q = 1 + (aua^{-1})^q = 1 + u^q$ (because $u^q \in k$). We obtain $1 = 0$, a
contradiction. It follows that $D$ contains a proper separable extension.

Taking a separable extension of maximal degree in $D$, we obtain a maximal separable
extension $F$. By Theorem 5.1.10, its centralizer $F'$ is simple with centre $F$, but as
centralizer in a division algebra $F'$ is itself a division algebra, so if $F' \ne F$, then $F$ has
a proper separable extension $E$, by the first part. But then $E$ is separable over $k$, which
contradicts the maximality of $F$. Hence $F' = F$, i.e. $F$ is a maximal subfield and by
Corollary 5.2.7, a splitting field of $D$, separable by construction. ∎
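The identity $x\delta^p = xa^p - a^px$ used in the proof holds in any ring of characteristic $p$, because right and left multiplication by $a$ commute. The Python sketch below checks it by brute force in a setting chosen here for illustration, not taken from the text: the ring of $2 \times 2$ matrices over the field of 3 elements, which is of course not a division algebra, with a sample element $a$ whose cube is the identity. The helper names are hypothetical.

```python
from itertools import product

P = 3  # the characteristic


def mat_mul(x, y):
    """Product of two 2 x 2 matrices, reduced mod P."""
    return [[sum(x[i][t] * y[t][j] for t in range(2)) % P for j in range(2)]
            for i in range(2)]


def mat_sub(x, y):
    """Difference of two 2 x 2 matrices, reduced mod P."""
    return [[(x[i][j] - y[i][j]) % P for j in range(2)] for i in range(2)]


def mat_pow(a, n):
    """n-th power of a, by repeated multiplication."""
    result = [[1, 0], [0, 1]]
    for _ in range(n):
        result = mat_mul(result, a)
    return result


def delta(x, a):
    """The inner derivation x -> xa - ax induced by a."""
    return mat_sub(mat_mul(x, a), mat_mul(a, x))


def delta_power(x, a, times):
    """Apply the derivation induced by a to x the given number of times."""
    for _ in range(times):
        x = delta(x, a)
    return x


a = [[1, 1], [0, 1]]
a_p = mat_pow(a, P)  # equals the identity, so a^p is central although a is not
zero = [[0, 0], [0, 0]]
all_x = [[[e0, e1], [e2, e3]] for e0, e1, e2, e3 in product(range(P), repeat=4)]

identity_holds = all(
    delta_power(x, a, P) == mat_sub(mat_mul(x, a_p), mat_mul(a_p, x)) for x in all_x)
delta_is_nonzero = any(delta(x, a) != zero for x in all_x)
delta_p_is_zero = all(delta_power(x, a, P) == zero for x in all_x)
```

This mirrors the situation of the proof: $a^p$ is central, so $\delta^p = 0$, while $\delta$ itself is non-zero.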


We remark that the result holds for any skew fields that are algebraic over k but 
not necessarily finite-dimensional. For we can use Zorn’s lemma instead of a dimen- 
sion argument to obtain a maximal separable extension. 


Corollary 5.2.9. Every Brauer class of $k$ has a splitting field which is a finite Galois
extension of $k$.


Proof. Take a division algebra $D$ in $w \in B_k$ and let $F$ be a maximal separable extension
in $D$. Then $F$ is contained in a finite Galois extension of $k$, and this will also
split $D$. ∎


Of course the Galois splitting field need not be contained as a subfield in D. Later, 
in Section 5.5, we shall find that splitting subfields that are Galois lead to the crossed 
product construction. 

A basic question in the theory of central simple algebras is this: when is the tensor 
product of two division algebras again a division algebra? In essence this is a question 
about the index of a tensor product, and little explicit information is available. We 
first take a simple case, where there is a complete answer. 




Proposition 5.2.10. If $C$, $D$ are two central division $k$-algebras of coprime degrees, then
$C \otimes D$ is again a division algebra.


Proof. By Corollary 5.1.3, $A = C \otimes D$ is central simple, say $A \cong K_n$ for a skew field $K$.
If the simple $A$-module is denoted by $V$, then $A \cong V^n$; here $V$ is isomorphic to a
minimal right ideal, hence a $C$-space, and so $n \mid [A : C] = [D : k]$; similarly,
$n \mid [C : k]$, so $n = 1$. ∎


Our next result provides a description of the index of a tensor product, even when
only one of the factors has finite degree. We remark that if $R$ is a simple Artinian ring
and $R_r \cong D_n$, where $D$ is a skew field, then $r \mid n$, say $n = rs$ and $R \cong D_s$. For by
Wedderburn's theorem, $R$ has the form $R \cong K_s$, where $K$ is a skew field. It follows
that $K_{rs} \cong D_n$ and by uniqueness, $n = rs$ and $K \cong D$, and so $R \cong D_s$.


Theorem 5.2.11. Let $F/k$ be a field extension of degree $d$, and let $C$, $D$ be skew fields
with centres $k$, $F$ respectively. If either $C$ or $D$ is of finite degree, then

$$C \otimes_k D \cong G_m, \tag{5.2.5}$$

where $G$ is a skew field with centre $F$. Moreover,

(i) if $\operatorname{Deg} C = r$, then

$$D_q \cong C^\circ \otimes_k G, \quad \text{where } mq = r^2, \tag{5.2.6}$$

and $q$ is the least integer such that $C^\circ$ can be embedded in $D_q$;

(ii) if $\operatorname{Deg} D = s$, then $D^\circ$ can be embedded in $C_n$ for $n = s^2d/m$ but no smaller $n$, the
centralizer of $D^\circ$ in $C_n$ is isomorphic to $G$ and if $F'$ denotes the centralizer of $F$ in
$C_n$, then

$$F' \cong D^\circ \otimes_F G, \quad \text{and } mn = s^2d. \tag{5.2.7}$$

In particular, in case (i) $q \mid r^2$ and $C \otimes D$ is a skew field if and only if $q = r^2$, while
in case (ii) $n \mid s^2d$ and $C \otimes D$ is a skew field if and only if $n = s^2d$.

(iii) When both $C$, $D$ have finite degrees $r$, $s$ respectively, and $n$, $q$ are as in (i), (ii), then
$n$, $q$ are related by the equation

$$nr^2 = qs^2d, \tag{5.2.8}$$

and $\operatorname{Deg} G = t$, where

$$t = sq/r = rn/sd, \qquad m = r^2/q = s^2d/n. \tag{5.2.9}$$

From (i), (ii) it is clear that $r$, $s$, $d$ are independent, while $n$, $q$ are related as in (5.2.8),
and $m$, $t$ are determined by (5.2.9) in terms of $r$, $s$, $d$, $n$, $q$.


Proof. The algebra $C \otimes D$ is simple with centre $F$, by Theorem 5.1.2. If $C$ has finite
degree, then $C \otimes D$ is finite-dimensional over $D$; if $D$ has finite degree, $C \otimes D$ is
finite-dimensional over $C$. In either case it is Artinian and by Wedderburn's theorem
it has the form $G_m$, for some $m$, where $G$ is a skew field with centre $F$.

(i) Suppose now that $\operatorname{Deg} C = r$; then $C$ and hence $C^\circ$ can be embedded in $k_{r^2}$
and hence in $D_{r^2}$. Let $q$ be the least integer such that $C^\circ$ can be embedded in $D_q$ as
$k$-algebra. Then by Proposition 5.1.5, $D_q \cong C^\circ \otimes_k E$, where $E$ is a simple algebra
with centre $F$, by Theorem 5.1.2. Moreover, $E$ is Artinian, for if $\mathfrak{a}$ is a left ideal,
then $C^\circ \otimes \mathfrak{a}$ is a left $D$-space, hence the length of chains of left ideals is bounded
by $q$. Thus $E$ is a matrix ring over a skew field. Taking Brauer classes we have
$(E) = (C)(D) = (G)$, therefore $E \cong G_h$ for some $h$, but if $h > 1$, we can replace $q$
by $q/h$, which contradicts the minimality of $q$. Hence $h = 1$ and $D_q \cong C^\circ \otimes G$. It
follows that $D_{qm} \cong C^\circ \otimes G_m \cong C^\circ \otimes C \otimes D \cong k_{r^2} \otimes D \cong D_{r^2}$ and so $qm = r^2$;
thus (5.2.6) is established.

(ii) Next assume that $\operatorname{Deg} D = s$; then $D$ and with it $D^\circ$ can be embedded in $F_{s^2}$
and hence in $k_{s^2d}$. Let $n$ be the least integer for which $D^\circ$ can be embedded in $C_n$ and
denote by $H$ the centralizer of $D^\circ$ in $C_n$. Then by Theorem 5.1.10, $F' \cong D^\circ \otimes_F H$,
where $F'$ is the centralizer of $F$, and $H$ is simple with centre $F$, while
$C_n \otimes_k D \cong H_{s^2d}$, by (5.1.12). Comparing this relation with (5.2.5), we see that
$s^2d = mn$ and $H \cong G$. Finally when both $C$ and $D$ have finite degrees, then by
combining (5.2.6) and (5.2.7) we see that $m = r^2/q = s^2d/n$, therefore $nr^2 = qs^2d$,
and if $\operatorname{Deg} G = t$, then a comparison of degrees in (5.2.5), (5.2.6) yields
$t = sq/r = rn/sd$. ∎
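The bookkeeping in (5.2.8) and (5.2.9) can be tried out on two classical cases. The Python sketch below is an illustration; the function name is hypothetical, and the two examples rest on the well-known isomorphisms $\mathbb{H} \otimes_{\mathbb{R}} \mathbb{C} \cong \mathbb{C}_2$ and $\mathbb{H} \otimes_{\mathbb{R}} \mathbb{H} \cong \mathbb{R}_4$ for the real quaternions $\mathbb{H}$, which are assumed here rather than derived from the text.

```python
from fractions import Fraction


def tensor_invariants(r, s, d, q, n):
    """Return (m, t) as given by (5.2.9), after checking the relation (5.2.8)."""
    if n * r * r != q * s * s * d:
        raise ValueError("n and q violate n r^2 = q s^2 d")
    m = Fraction(r * r, q)   # m = r^2/q, which equals s^2 d/n by (5.2.8)
    t = Fraction(s * q, r)   # t = sq/r, which equals rn/sd by (5.2.8)
    if m.denominator != 1 or t.denominator != 1:
        raise ValueError("m and t must be integers")
    return int(m), int(t)


# C = H over k = R (r = 2), D = F = C (s = 1, d = 2).
# H embeds in the 2 x 2 complex matrices but not in C itself, so q = 2;
# C embeds in H, so n = 1.
quaternions_with_complexes = tensor_invariants(r=2, s=1, d=2, q=2, n=1)

# C = D = H over k = F = R (r = s = 2, d = 1); here q = n = 1.
quaternions_with_quaternions = tensor_invariants(r=2, s=2, d=1, q=1, n=1)
```

The first call returns $(m, t) = (2, 1)$, matching $G = \mathbb{C}$ and $G_m = \mathbb{C}_2$; the second returns $(4, 1)$, matching $G = \mathbb{R}$ and $G_m = \mathbb{R}_4$.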


Consider the special case when D = F. Then s = 1 and mn = d, so we obtain a 
result essentially contained in Proposition 5.2.3: 


Corollary 5.2.12. Let $C$ be a skew field with centre $k$ and $F$ a finite extension of $k$. Then
$C_F \cong G_m$ for some skew field $G$, where $m$ divides $[F : k]$, with equality if and only if $F$
can be embedded in $C$. ∎


We remark that here C need not be finite-dimensional over k. 


Examples of Brauer groups 


1. We have already seen that the Brauer group of an algebraically closed field is
trivial, e.g. $B_{\mathbb{C}} = 0$. More generally, this holds for any field which is separably
closed, by Theorem 5.2.8.

2. The Brauer group of any finite field is trivial. For any division algebra over a finite
field $F$ is itself a finite skew field, hence commutative by Theorem 5.1.14, so it
reduces to $F$. This case will be generalized in the next section.

3. The Brauer group of the real numbers has order 2. For the only algebraic extensions
of $\mathbb{R}$ are $\mathbb{R}$, $\mathbb{C}$, so any division algebra has degree 1 or 2, and as we shall see in
Section 5.4, the only algebra of degree 2 is the algebra of quaternions.

4. If $F$ is a complete field for a discrete valuation with finite residue class field (e.g.
the $p$-adic field $\mathbb{Q}_p$), then $B_F \cong \mathbb{Q}/\mathbb{Z}$.

5. If $F$ is an algebraic number or function field, then $B_F$ is a subgroup of the direct
sum of the Brauer groups of the corresponding local fields, described in examples
3 and 4 above (see Weil (1967) or Reiner (1976)). More precisely, we have an exact
sequence (Hasse reciprocity)

$$0 \to B_F \to \bigoplus B_{F_v} \to \mathbb{Q}/\mathbb{Z} \to 0,$$

where $F_v$ are the completions of $F$.


Exercises 


1. Define Brauer classes for any central simple algebra, not necessarily finite-dimensional,
and show that these classes form a monoid whose group of units
is $B_k$.

2. In Proposition 5.2.3 show that q is the degree of the largest subfield common to F 
and D. 

3. Show that any skew field p-radical over its centre is commutative. 

4. Prove Theorem 5.2.8 in detail for skew fields algebraic over k. 

5. Let $D$ be a central division $k$-algebra. For any automorphism $\alpha$ of $D$ as a skew field
define the inner order as the least $r$ such that $\alpha^r$ is inner. Show that the inner order
of any such $\alpha$ divides the order of the restriction $\alpha|k$.

6. Show that a central simple algebra of degree n is split iff it has a left ideal of 
dimension n. 


5.3 The reduced norm and trace 


In BA, Section 5.5, we met the notions of norm and trace of an element in a finite-dimensional
$k$-algebra. We recall that for an $m$-dimensional algebra $A$ with basis
$u_1, \dots, u_m$, the right multiplication is represented by a matrix $\rho(a) = (\rho_{ij}(a))$, where

$$u_ia = \sum_j \rho_{ij}(a)u_j \quad \text{for all } a \in A. \tag{5.3.1}$$

Here $\rho : A \to \mathfrak{M}_m(k)$ is the regular representation and the norm and trace are
defined in terms of it by

$$\operatorname{Nm}(a) = \det(\rho(a)), \qquad \operatorname{Tr}(a) = \sum_i \rho_{ii}(a). \tag{5.3.2}$$

When $A$ is central over $k$, we have $m = n^2$ and for a splitting field $F$ of $A$ we have

$$A_F \cong F_n, \quad \text{where } n^2 = [A : k]. \tag{5.3.3}$$

Let $e_{ij}$ be the standard basis of matrix units for $F_n$ and write $a \in A$ as $a = \sum a_{ij}e_{ij}$.
Then the equation (5.3.1) takes the form

$$e_{rs}a = \sum_v a_{sv}e_{rv},$$

hence the matrix $\rho(a)$ has as $(rs, uv)$-entry $(a_{sv}\delta_{ru})$ and the equations (5.3.2) become
in this case

$$\operatorname{Nm}(a) = \det(a_{sv}\delta_{ru}) = (\det(a_{ij}))^n, \qquad \operatorname{Tr}(a) = \sum a_{ss}\delta_{rr} = n\sum a_{ss}. \tag{5.3.4}$$

This is most easily seen by writing $A$ as $V^n$, where $V$ is a minimal right ideal, corresponding
to a single row of $F_n$. The right action is right multiplication by $a$, so that
the matrix $\rho(a)$ is the diagonal sum of $n$ terms $a$, and we obtain (5.3.4). In particular
this shows that $\operatorname{Tr} = 0$ whenever $n$ is a multiple of char $k$, and it suggests that we can
get more information by taking the determinant and trace of $a$ itself as invariants.
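The formulae (5.3.4) can be checked directly for a full matrix ring. The Python sketch below is an illustration with choices made here: $n = 2$, integer entries, the basis of matrix units in the order $e_{11}, e_{12}, e_{21}, e_{22}$, and hypothetical helper names. It builds the matrix of the regular representation from the products $e_{rs}a$ and compares its determinant and trace with those of $a$.

```python
from itertools import product


def mat_mul(x, y):
    """Product of two square matrices of the same size."""
    n = len(x)
    return [[sum(x[i][t] * y[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def matrix_unit(n, r, s):
    """The n x n matrix unit e_rs."""
    return [[1 if (i, j) == (r, s) else 0 for j in range(n)] for i in range(n)]


def regular_representation(a):
    """Matrix of x -> xa on the basis of matrix units: row rs, column uv."""
    n = len(a)
    index = list(product(range(n), repeat=2))
    rho = []
    for (r, s) in index:
        image = mat_mul(matrix_unit(n, r, s), a)        # the product e_rs * a
        rho.append([image[u][v] for (u, v) in index])   # coefficient of e_uv
    return rho


a = [[1, 2], [3, 4]]
rho = regular_representation(a)
norm = det(rho)                            # Nm(a)
trace = sum(rho[i][i] for i in range(4))   # Tr(a)
```

Here `rho` is the diagonal sum of two copies of $a$, so that $\operatorname{Nm}(a) = (\det a)^2 = 4$ and $\operatorname{Tr}(a) = 2(1 + 4) = 10$.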

This is accomplished as follows, for any central simple $k$-algebra. Let $F/k$ be a
Galois extension which splits $A$:

$$A \xrightarrow{\ \lambda\ } A_F \xrightarrow{\ \mu\ } F_n,$$

where $\lambda$ is the natural embedding and $\mu$ is an isomorphism. Then $\lambda\mu$ embeds $A$ in $F_n$
and there we have the usual norm and trace. We define the reduced norm $N(a)$ and
the reduced trace $T(a)$ of any $a \in A$ as

$$N_{A/k}(a) = N(a) = \det(a\lambda\mu), \qquad T_{A/k}(a) = T(a) = \operatorname{tr}(a\lambda\mu) \quad \text{for } a \in A. \tag{5.3.5}$$

We note that whereas $\lambda$ is the canonical mapping $a \mapsto a \otimes 1$, $\mu$ is not uniquely determined,
but the definition (5.3.5) is independent of the choice of $\mu$, for two isomorphisms
of $A_F$ with $F_n$ differ by an automorphism of $F_n$, which must be inner,
by the Skolem–Noether theorem, and so leave $N$, $T$ unaffected. From the definition
$N(a)$, $T(a)$ lie in $F$, but if $\sigma$ is a $k$-automorphism of $F$, then $\sigma$ induces a $k$-automorphism
of $F_n$ which gives another representation $a \mapsto (a\lambda\mu)^\sigma$. Since $A$ is a
$k$-algebra, $\sigma$ leaves $a \in A$ fixed and so $N(a)^\sigma = N(a)$, $T(a)^\sigma = T(a)$. This holds for
all $\sigma \in \operatorname{Gal}(F/k)$, hence $N(a), T(a) \in k$. Further, if $F'$ is another separable splitting
field of $A$, we can find a Galois extension $E$ to contain both $F$ and $F'$, and it follows
that $F$ and $F'$ give rise to the same $N$ and $T$. The following familiar properties of
norm and trace are easily verified; here $[A : k] = n^2$.


R.1 $N(ab) = N(a)N(b)$, $N(\alpha a) = \alpha^nN(a)$, $N(1) = 1$, where $\alpha \in k$,

R.2 $T(a + b) = T(a) + T(b)$, $T(\alpha a) = \alpha T(a)$, $T(1) = n$,

R.3 $T(ab) = T(ba)$,

R.4 $\operatorname{Nm}(a) = N(a)^n$, $\operatorname{Tr}(a) = n.T(a)$.
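For the real quaternions, which are split by $\mathbb{C}$, the definition (5.3.5) can be made quite explicit. The Python sketch below is an illustration resting on assumptions not in the text: a quaternion $a + bi + cj + dk$ is stored as the tuple `(a, b, c, d)`, the composite of $\lambda$ and $\mu$ is taken to be the standard embedding in the $2 \times 2$ complex matrices with $i \mapsto \operatorname{diag}(i, -i)$, and all function names are hypothetical. Small integer entries are used so that the complex arithmetic is exact.

```python
def quat_mul(x, y):
    """Hamilton product of two quaternions given as tuples (a, b, c, d)."""
    a1, b1, c1, d1 = x
    a2, b2, c2, d2 = y
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)


def embed(x):
    """Image of a + bi + cj + dk in the 2 x 2 complex matrices."""
    a, b, c, d = x
    return [[complex(a, b), complex(c, d)],
            [complex(-c, d), complex(a, -b)]]


def mat_mul(m, n):
    """Product of two 2 x 2 complex matrices."""
    return [[sum(m[i][t] * n[t][j] for t in range(2)) for j in range(2)]
            for i in range(2)]


def reduced_norm(x):
    """N(x): determinant of the image matrix, equal to a^2 + b^2 + c^2 + d^2."""
    m = embed(x)
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def reduced_trace(x):
    """T(x): trace of the image matrix, equal to 2a."""
    m = embed(x)
    return m[0][0] + m[1][1]


x, y = (1, 2, 3, 4), (0, 1, -1, 2)
xy, yx = quat_mul(x, y), quat_mul(y, x)

is_homomorphism = embed(xy) == mat_mul(embed(x), embed(y))
norm_is_multiplicative = reduced_norm(xy) == reduced_norm(x) * reduced_norm(y)
trace_is_symmetric = reduced_trace(xy) == reduced_trace(yx)
```

With these values $N(x) = 30$, $N(y) = 6$ and $N(xy) = 180$, while $T(x) = 2$; both invariants are real, as the Galois argument above predicts, and R.1 and R.3 hold on the sample.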


We also have a product formula for the reduced norm and trace. For a field extension
$F/k$ we shall write $N_{F/k}$ and $T_{F/k}$ for the usual norm and trace.


Proposition 5.3.1. Let $A$ be a central simple $k$-algebra and $B$ a simple subalgebra of $A$,
with centre $F$. Suppose that $\operatorname{Deg} A = n$, $\operatorname{Deg} B = r$, $[F : k] = t$; then $rt \mid n$, say $n = rst$,
and for any $b \in B$,

$$N_{A/k}(b) = N_{F/k}(N_{B/F}(b))^s, \qquad T_{A/k}(b) = s.T_{F/k}(T_{B/F}(b)). \tag{5.3.6}$$




Proof. Let E be a Galois splitting field of B which also splits A. Then $A_E \cong E_n$ and
under the mapping $A \to A_E$, B becomes

    $B \otimes_k E = (B \otimes_F F) \otimes_k E \cong (B \otimes_F E) \otimes_k F \cong (E \otimes_k F)_r$.   (5.3.7)

We thus have an embedding of $(E \otimes_k F)_r$ in $E_n$, so $E_n$ is an r × r matrix ring,
$E_n = C_r$, where C is simple Artinian, hence $C \cong E_{n/r}$ by uniqueness.

Since $A_E \cong E_n$, there is a unique simple right $A_E$-module $V \cong E^n$. By (5.3.7),
$B \otimes E$ is faithfully represented by endomorphisms of $U = (E \otimes F)^r$. Now
$[U : E] = r.[E \otimes F : E] = rt$ and $B \otimes E$ acts on V, hence $V \cong U^s$ for some s, and a
comparison of dimensions shows that n = rst. For any $b \in B$ we have
$b \otimes 1 \in B \otimes_F E \cong E_r$, so $b \otimes 1$ is represented by an r × r matrix and
$N_{B/F}(b) = \det(b \otimes 1)$, $T_{B/F}(b) = \mathrm{Tr}(b \otimes 1)$. If we now tensor with F and consider
$b \otimes 1 \otimes 1$ in $(B \otimes_F E) \otimes_k F \cong (E \otimes_k F)_r$, we have

    $\det(b \otimes 1 \otimes 1) = N_{F/k}(N_{B/F}(b))$,   $\mathrm{Tr}(b \otimes 1 \otimes 1) = T_{F/k}(T_{B/F}(b))$,

and since $V \cong U^s$, we obtain (5.3.6). ∎
We note the special case B = F: 


Corollary 5.3.2. Let A be a central simple k-algebra of degree n and F a subfield of A, of
degree t over k. Then t | n, say n = st, and for any $a \in F$,

    $N_{A/k}(a) = N_{F/k}(a)^s$,   $T_{A/k}(a) = s.T_{F/k}(a)$. ∎


In particular, when A is a division algebra and F a maximal subfield, then s = 1 and
the norm and trace in F coincide with the reduced norm and trace in A. Further, if in
Corollary 5.3.2 we take F = k(a), then $N_{F/k}(a)$ is (up to sign) the constant term of
the minimal polynomial of a over k, and a is invertible precisely when the constant
term is non-zero. We deduce


Proposition 5.3.3. Let A be a central simple algebra. Then an element a of A is
invertible if and only if $N(a) \neq 0$. ∎


Let us denote the group of units of A by U(A). The reduced norm defines a
mapping $U(A) \to k^\times$ which by R.1 above is a homomorphism. Since $k^\times$ is commutative,
the commutator subgroup U(A)' is mapped to 1 in this homomorphism. Let us
define the Whitehead group of A as

    $K_1(A) = U(A)^{ab} = U(A)/U(A)'$.

By what has been said, the reduced norm induces a homomorphism $\nu : K_1(A) \to k^\times$.
The kernel of this homomorphism is called the reduced Whitehead group and is
denoted by $SK_1(A)$. We thus have the exact sequence

    $1 \to SK_1(A) \to K_1(A) \to k^\times \to \mathrm{coker}\,\nu \to 1$.   (5.3.8)


For a matrix ring over a field, $K_1$ is easily determined:




Lemma 5.3.4. For any field k and any $n \geq 2$, except n = 2 and $k = F_2, F_3$,

    $SK_1(k_n) = 1$ and $K_1(k_n) \cong k^\times$.   (5.3.9)


Proof. We have a surjective homomorphism $GL_n(k) \to k^\times$, given by the
determinant, with kernel $SL_n(k)$; hence $SK_1(k_n) \cong SL_n(k)/GL_n(k)'$. Now by Proposition
3.5.2 we have $SL_n(k) = GL_n(k)'$ except when n = 2 and $k = F_2, F_3$, so (5.3.9) follows
with these exceptions. ∎


The exceptions in Lemma 5.3.4 are treated in Exercises 1 and 2. We next describe 
the diagram resulting from an algebra homomorphism. 


Theorem 5.3.5. Let F/k be a field extension and let A, B be simple algebras with centres
k, F respectively. If there is a k-algebra homomorphism $\theta : A \to B$, then
Deg B = d.Deg A for some $d \geq 1$, and there are homomorphisms such that the diagram

    $\begin{array}{ccccccccccc}
    1 & \to & SK_1(A) & \to & K_1(A) & \to & k^\times & \to & \mathrm{coker}\,\nu_{A/k} & \to & 1 \\
      &     & \downarrow &  & \downarrow {\scriptstyle K_1(\theta)} & & \downarrow & & \downarrow & & \\
    1 & \to & SK_1(B) & \to & K_1(B) & \to & F^\times & \to & \mathrm{coker}\,\nu_{B/F} & \to & 1
    \end{array}$   (5.3.10)

commutes, where the mapping $k^\times \to F^\times$ is the power mapping $x \mapsto x^d$.


Proof. It is clear that $\theta$ maps U(A) to U(B) and so induces a homomorphism
$K_1(\theta) : K_1(A) \to K_1(B)$. The F-subalgebra of B generated by $A\theta$ is a
homomorphic image of $A_F$ and hence, by the simplicity of the latter, isomorphic to $A_F$.
If $a \in A$, then $N_{A/k}(a) = N_{A_F/F}(a) = N_{F(a)/F}(a)^r$, where $r.[F(a) : F] = \mathrm{Deg}\,A_F$,
and $N_{B/F}(a\theta) = N_{F(a)/F}(a)^s$, where $s.[F(a) : F] = \mathrm{Deg}\,B$. Now $\mathrm{Deg}\,A_F = \mathrm{Deg}\,A$ is
a divisor of Deg B, because $A_F$ is embedded in B. Thus Deg B = d.Deg A and so
$s.[F(a) : F] = d.\mathrm{Deg}\,A = dr.[F(a) : F]$; therefore s = rd and $N_{B/F}(a\theta) = N_{A/k}(a)^d$.
This shows that the central square in (5.3.10) commutes and this determines the
outer vertical arrows. ∎


We note the special case when A is any central simple k-algebra and $B = A_F$. Then
we obtain an exact commutative diagram of the form (5.3.10) with $B = A_F$, where
the mapping $k^\times \to F^\times$ is the inclusion mapping (because now d = 1). In particular,
taking F to be a splitting field of A, we have an isomorphism $K_1(A_F) \cong F^\times$, by
Lemma 5.3.4 (with the exceptions listed), hence $SK_1(A_F)$ and $\mathrm{coker}\,\nu_{A_F/F}$ are then
trivial. We remark that any $\alpha \in k$ satisfies $N(\alpha) = \alpha^n$; it follows that coker $\nu$ is a
group of exponent dividing n, the index of A. It can be shown that $SK_1(A)$ has
finite exponent dividing $\prod p_i^{\alpha_i - 1}$, where $\prod p_i^{\alpha_i}$ is the index of A (see Draxl
[1983]). In fact for many ground fields, e.g. all algebraic number fields, it can be
shown that $SK_1(A) = 1$ and it was an open problem for many years whether
(apart from the trivial exceptions of Exercises 1 and 2) algebras with non-trivial
reduced Whitehead group exist (Tannaka–Artin problem). In 1975 Vladimir Platonov
gave examples of algebras with non-trivial reduced Whitehead group; we shall meet
some simple examples due to Peter Draxl later, in Section 7.3.




The reduced norm can be used to show that $B_k$ is trivial for certain fields k. In any
central division algebra A of degree r we have $N(x) \neq 0$ for $x \neq 0$, and taking a basis
$u_1, \ldots, u_n$ ($n = r^2$) of A, we can write the general element of A as $x = \sum \xi_i u_i$. Now
N(x) becomes a form, i.e. a homogeneous polynomial, of degree r in the $r^2$ variables $\xi_i$.
A field k is said to be quasi-algebraically closed or a $C_1$-field if every form of degree d
in n > d variables has a non-trivial zero. With this definition we have


Theorem 5.3.6. Every $C_1$-field has a trivial Brauer group.


Proof. Let k be a $C_1$-field; we have to show that there are no central division algebras
other than k. Let D be a central division algebra of degree r over k. The reduced norm
N(x) is a form of degree r in $r^2$ variables, and N(x) = 0 has no non-trivial solutions,
hence $r^2 \leq r$, so r = 1 and D = k, as claimed. ∎


An obvious example of $C_1$-fields are the algebraically closed fields; we shall soon
meet other examples. For the moment we note a reduction that is sometimes useful:


Proposition 5.3.7. Any finite extension of a $C_1$-field is a $C_1$-field.


Proof. Let F/k be an extension of degree r and take a basis $v_1, \ldots, v_r$ of F over k.
If $f(x_1, \ldots, x_n)$ is a form of degree d < n with coefficients in F, let us write
$x_i = \sum_j \xi_{ij} v_j$ and consider

    $g(\xi_{11}, \ldots, \xi_{nr}) = N_{F/k}\bigl(f(\sum_j \xi_{1j} v_j, \ldots, \sum_j \xi_{nj} v_j)\bigr)$.

We claim that g is homogeneous of degree dr in the $\xi$'s:

    $g(\lambda\xi_{11}, \ldots, \lambda\xi_{nr}) = N_{F/k}\bigl(f(\sum_j \lambda\xi_{1j} v_j, \ldots, \sum_j \lambda\xi_{nj} v_j)\bigr)$
        $= N_{F/k}\bigl(\lambda^d f(\sum_j \xi_{1j} v_j, \ldots, \sum_j \xi_{nj} v_j)\bigr) = \lambda^{dr} N_{F/k}(f)$.

Clearly g has coefficients in k, and it is of degree dr in the nr variables $\xi_{ij}$. Since
dr < nr and k is $C_1$, it follows that $g(\xi') = 0$ for some $\xi'_{ij} \in k$, not all 0. This
gives $x'_i = \sum_j \xi'_{ij} v_j \in F$, not all 0, such that $N_{F/k}(f(x')) = 0$, hence $f(x') = 0$, so f
has a non-trivial zero in F, as claimed. ∎


Let us show that every finite field is $C_1$. By Proposition 5.3.7 we can limit ourselves
to $F_p$, but that is no easier. We shall need a formula for power sums in $F_q$:


Lemma 5.3.8. Let $k = F_q$ be the field of $q = p^r$ elements. Then for any integer $m \geq 1$,

    $S_m = \sum_{x \in k} x^m = \begin{cases} -1 & \text{if } q - 1 \mid m, \\ 0 & \text{otherwise.} \end{cases}$


Proof. We have $x^{q-1} = 1$ for $x \neq 0$ and $x^{q-1} = 0$ for x = 0, hence
$S_{q-1} = \sum x^{q-1} = q - 1 = -1$; similarly, $S_{i(q-1)} = -1$. When m is not divisible by q − 1,
then $a^m \neq 1$ for some $a \in k^\times$, hence $S_m = \sum x^m = \sum (ax)^m = a^m S_m$, so
$(1 - a^m)S_m = 0$ and since $a^m \neq 1$, we conclude that $S_m = 0$. ∎
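
The formula of Lemma 5.3.8 is easy to test numerically. The sketch below is an illustration only and is restricted to prime fields $F_p$ (so q = p), where arithmetic is ordinary arithmetic modulo p; the function names are assumptions made for this example.

```python
def power_sum(p, m):
    """S_m = sum of x**m over all x in the prime field F_p, reduced mod p (m >= 1)."""
    return sum(pow(x, m, p) for x in range(p)) % p


def lemma_holds(p, max_m):
    """Compare S_m with the value predicted by Lemma 5.3.8 for 1 <= m <= max_m."""
    for m in range(1, max_m + 1):
        # -1 is represented by p - 1 modulo p.
        expected = (p - 1) if m % (p - 1) == 0 else 0
        if power_sum(p, m) != expected:
            return False
    return True


for p in (2, 3, 5, 7, 11):
    print(p, lemma_holds(p, 3 * (p - 1) + 2))
```

Fields of prime power order $q = p^r$ with r > 1 would need an explicit model of $F_q$, which this sketch does not attempt.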


With the help of this formula we can show that the number of points on the
hypersurface over $F_q$ defined by a polynomial in more variables than its degree is
divisible by p:


Theorem 5.3.9 (Chevalley–Warning, 1934). Let f be a polynomial in n variables over
$k = F_q$, where $q = p^r$. Write V(f) for the set of zeros of f. If n > d, where d is the degree
of f, then

    $|V(f)| \equiv 0 \pmod p$.   (5.3.11)

In particular, if f has zero constant term, then it has a non-trivial zero.


Proof. For each $x \in k^n$ we have

    $1 - f(x)^{q-1} = \begin{cases} 1 & \text{if } x \in V(f), \\ 0 & \text{if } x \notin V(f). \end{cases}$

Thus $1 - f(x)^{q-1}$ is the characteristic function of V(f), and summing over all
points x of $k^n$, we find

    $|V(f)| \equiv \sum_x (1 - f(x)^{q-1}) = -\sum_x f(x)^{q-1} \pmod p$,

because the total number of points in $k^n$ is $q^n \equiv 0$. Now $f(x)^{q-1}$ is a linear combination
of terms $x_1^{\nu_1} \cdots x_n^{\nu_n}$. We have

    $\sum_{x \in k^n} x_1^{\nu_1} \cdots x_n^{\nu_n} = \prod_{i=1}^n \sum_{x_i \in k} x_i^{\nu_i} = \prod_{i=1}^n S_{\nu_i}$.   (5.3.12)

If $\nu_i = 0$ for some i, then $S_{\nu_i} \equiv 0 \pmod q$ and we get zero, so we may assume that
$\nu_i > 0$ for i = 1, ..., n. But by Lemma 5.3.8, $S_m = 0$ unless $q - 1 \mid m$, and since
$\sum \nu_i \leq d(q - 1) < n(q - 1)$, it follows that some $\nu_i$ is not divisible by q − 1. So in
any case the sum in (5.3.12) is zero and (5.3.11) follows.

Moreover, if f(0) = 0, then the number of non-zero roots of f = 0 is
$\equiv -1 \pmod p$, hence it is non-zero. ∎
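
The congruence (5.3.11) can be observed by brute force over small prime fields. The following sketch is an illustration with hypothetical names (`count_zeros` and the sample polynomials are not from the text); it counts zeros of polynomials of degree 2 in 3 variables, and also shows that the hypothesis n > d cannot be dropped.

```python
from itertools import product


def count_zeros(f, p, n):
    """Number of points x in F_p^n with f(x) = 0, found by exhaustive search."""
    return sum(1 for x in product(range(p), repeat=n) if f(*x) % p == 0)


def sum_of_squares(x, y, z):
    """A form of degree d = 2 in n = 3 variables."""
    return x * x + y * y + z * z


def with_constant_term(x, y, z):
    """Degree 2 in 3 variables, but with non-zero constant term."""
    return x * y + z + 1


for p in (2, 3, 5, 7):
    a = count_zeros(sum_of_squares, p, 3)
    b = count_zeros(with_constant_term, p, 3)
    print(p, a, a % p == 0, b, b % p == 0)

# With n = d = 2 the conclusion can fail: x^2 + y^2 has only the trivial zero over F_3.
print(count_zeros(lambda x, y: x * x + y * y, 3, 2))
```

Exhaustive search costs $p^n$ evaluations, so this is only meant for very small p and n.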


If f is homogeneous of positive degree, its constant term is 0 and we obtain


Corollary 5.3.10. Every finite field is $C_1$. ∎


This then shows that $B_{F_q} = 0$; in Exercise 4 we shall meet another proof of this fact.
As a third example of $C_1$-fields we consider function fields of degree 1:


Theorem 5.3.11 (Tsen's theorem). Let k be an algebraically closed field and F a field
of functions in one variable over k. Then $B_F = 0$.




Proof. F is a finite algebraic extension of the rational function field k(t). By
Proposition 5.3.7 it will be enough to show that k(t) is a $C_1$-field. Let $f(x_1, \ldots, x_n)$ be a
polynomial over k(t), homogeneous of degree d < n. We shall show that f(x) = 0 has a
solution when the $x_i$ are polynomials in t. Write

    $x_i = \xi_{i0} + \xi_{i1} t + \cdots + \xi_{ir} t^r$.

The coefficients of f are rational functions of t and on multiplying f by an element
of k(t) we may take them to be polynomials in t, of degree $\leq k$, say. Then

    $f(x_1, \ldots, x_n) = p_0 + p_1 t + \cdots + p_{rd+k} t^{rd+k}$,

where each $p_i$ is a form, i.e. a homogeneous polynomial in the $\xi$'s with coefficients in k. We
thus have rd + k + 1 forms $p_i$ in the (r + 1)n variables $\xi_{ij}$. We eliminate a variable $\xi_{ij}$ by
taking a form in which it occurs and forming resultants with the remaining $p_i$; this
diminishes the forms and variables by one. By continuing this process we eventually
obtain a form in at least two variables, provided that the number of variables is
greater than the number of forms, i.e. (r + 1)n > rd + k + 1, and this holds for
suitable r because d < n. This means that all the $p_i$ have a common zero, as we
had to show. ∎


Exercises 


1. Show that for $A = \mathfrak{M}_2(F_2)$, $SK_1(A) = K_1(A) = C_2$, the cyclic group of order 2.

2. Show that for $A = \mathfrak{M}_2(F_3)$, $SK_1(A) = C_3$, $K_1(A) = C_6$. (Hint. Verify that the
mapping $\begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix} \mapsto x$ extends to a homomorphism $SL_2(F_3) \to F_3^+$ (Cohn [1966]).)

3. Let A be a central simple k-algebra and let p be a prime dividing the index of A.
Show that there is a finite extension F of k such that $A_F$ has index p.

4. Use Theorem 5.1.14 to show that every central simple algebra over a finite field F
splits and deduce that F has trivial Brauer group.

5. Show that a field is quasi-algebraically closed iff, for all n > 1, every form of
degree n − 1 in n variables has a non-trivial solution.

6. Show that a field is algebraically closed iff, for all n > 1, every form of degree n in
n variables has a non-trivial zero. (Hint. If k is not algebraically closed and f is an
irreducible polynomial of degree > 1, take the field F = k[t]/(f) and consider the
norm of F/k.)


5.4 Quaternion algebras 


The first skew field was discovered by William Rowan Hamilton in 1843. His 
aim had been to find a generalization of the complex numbers, to represent three- 
dimensional vectors; it took some 12 years to realize that four dimensions rather 
than three were needed and that the commutative law had to be given up. The
algebra he found, which he called the quaternions, was a four-dimensional R-algebra
H with basis 1, i, j, k, and multiplication table (the entry in row x and column y is the
product xy):

         |  1    i    j    k
       --+-------------------
       1 |  1    i    j    k
       i |  i   −1    k   −j
       j |  j   −k   −1    i         (5.4.1)
       k |  k    j   −i   −1


The group of order 8 generated by i, j, k is called the quaternion group. Hamilton 
and his followers developed an elaborate geometrical calculus on the basis of the 
quaternions, but this will not concern us here. For us the quaternions form the 
simplest division algebra and an important tool in the general theory. 

Let k be any field of characteristic not 2 and let $a, b \in k^\times$. We define the quaternion
algebra (a, b; k) as the k-algebra with basis 1, u, v, uv and multiplication rules

    $u^2 = a$,  $v^2 = b$,  $vu = -uv$.   (5.4.2)

In this notation Hamilton's algebra becomes (−1, −1; R). When char k = 2, we
define the quaternion algebra (a, b; k] as the algebra with basis 1, u, v, uv and
multiplication rules

    $u^2 = a$,  $v^2 + v = b$,  $vu = uv + u$.   (5.4.3)

It is easily checked that in each case the quaternion algebra is central simple; hence by
Wedderburn's theorem, it is either a division algebra or it is split, i.e. a full 2 × 2
matrix ring over k.

Let H be a quaternion algebra. Then any element $\alpha$ not in k is quadratic over k;
its equation may be written

    $\alpha^2 - t(\alpha)\alpha + n(\alpha) = 0$,   (5.4.4)

where $t(\alpha)$ and $n(\alpha)$ are the trace and norm respectively. Explicitly, if
$\alpha = t + xu + yv + zuv$, then for char k ≠ 2,

    $t(\alpha) = 2t$,   $n(\alpha) = t^2 - x^2 a - y^2 b + z^2 ab$

(the last sign is + because $(uv)^2 = -ab$), while for char k = 2 we have

    $t(\alpha) = y$,   $n(\alpha) = t^2 + x^2 a + y^2 b + z^2 ab + ty + axz$.


This is most easily seen by observing that H has an involution, i.e. an
anti-automorphism whose square is 1, $\alpha \mapsto \bar\alpha$, such that $t(\alpha) = \alpha + \bar\alpha$, $n(\alpha) = \alpha\bar\alpha$.

Our first result shows that the quaternion algebras effectively include all four- 
dimensional division algebras. 


Theorem 5.4.1. Let k be any field. Then any central simple k-algebra A possessing a 
two-dimensional splitting field is either split or a quaternion algebra. 




Proof. Since A has a two-dimensional splitting field, it is four-dimensional by
Theorem 5.2.6, and so is either $k_2$ or a division algebra. Leaving the first case aside,
we see by Theorem 5.2.8 that the splitting field may be taken to be a subfield of A and
separable over k. Suppose first that char k ≠ 2; then we may take the splitting field to
be F = k(u), where $u^2 = a \in k$. Clearly $u \mapsto -u$ defines an automorphism of F,
which by Corollary 5.1.8 is induced by an inner automorphism of A, say
$x \mapsto v^{-1}xv$. Hence $v^{-1}uv = -u$, and $v^2$ centralizes A and so lies in k, say $v^2 = b \in k$.
Now it is easily verified that 1, u, v, uv are linearly independent and clearly span A,
and they satisfy (5.4.2) by construction.

When char k = 2, we can take as our splitting field F = k(v), where
$v^2 + v = b \in k$. Now $v \mapsto v + 1$ is an automorphism, so for some $u \in A$ we have
$u^{-1}vu = v + 1$. Again $u^2$ centralizes A, so $u^2 = a \in k$, and as before 1, u, v, uv
form a basis for A, and (5.4.3) holds. ∎


As a consequence we have 


Corollary 5.4.2 (Frobenius' theorem, 1886). The only division algebras over the real
field are R, C and H, the Hamilton quaternions.


Proof. Let D be a division algebra over R. Since C is the only proper algebraic field
extension of R, if D ≠ R, C, then it must be non-commutative. Let F be a maximal
subfield of D; F is a proper extension of R, hence F = C and by Corollary 5.2.7, F is
a splitting field of D. Thus D is a quaternion algebra, (a, b; R) say, by Theorem 5.4.1.
If a or b is positive, it is easily seen to split, hence a, b < 0 and on dividing the basis
elements u by $\sqrt{-a}$ and v by $\sqrt{-b}$, we reach the form (−1, −1; R). ∎


In general (a, b; k) may be split; conditions for this to happen are given by


Proposition 5.4.3. Let H = (a, b; k) be a quaternion algebra over a field k of
characteristic not 2. Then the following conditions are equivalent:

(a) $H \cong (1, -1; k)$,

(b) H splits, i.e. $H \cong k_2$,

(c) H is not a skew field,

(d) n(x) = 0 for some x ≠ 0 in H,

(e) $a = N(\alpha)$ for some $\alpha \in k(\sqrt{b})$, where N is the norm from $k(\sqrt{b})$ to k,

(f) $ax^2 + by^2 = z^2$ has a non-zero solution in k.


Proof. (a) ⇒ (b). Consider the map $H \to k_2$ defined by

    $u \mapsto \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$,  $v \mapsto \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$,  $uv \mapsto \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$,

where $u^2 = 1$, $v^2 = -1$. It is easily checked that this map preserves the defining
relations of H and so defines a homomorphism from H to $k_2$. It is clearly surjective, and
so it is an isomorphism by a comparison of dimensions.
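
The relations claimed for these matrices can be checked mechanically. The sketch below is only an illustration, using integer entries (which map into any field k of characteristic not 2) and a hypothetical helper `matmul` for 2 × 2 matrices.

```python
def matmul(A, B):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def neg(A):
    """Entrywise negative of a matrix."""
    return [[-x for x in row] for row in A]


I2 = [[1, 0], [0, 1]]
U = [[1, 0], [0, -1]]   # image of u
V = [[0, 1], [-1, 0]]   # image of v
UV = matmul(U, V)       # image of uv

print(matmul(U, U) == I2)        # u^2 = 1
print(matmul(V, V) == neg(I2))   # v^2 = -1
print(matmul(V, U) == neg(UV))   # vu = -uv
print(UV)                        # the third matrix in the display above
```

Since I2, U, V, UV are visibly linearly independent when 2 is invertible, they span the four-dimensional algebra $k_2$, which is the surjectivity used in the proof.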

(b) ⇒ (c) is clear, as is (c) ⇒ (d), for if n(x) ≠ 0, then x has an inverse, as we see
from (5.4.4).

(d) ⇒ (e). If b is a square in k, then $k(\sqrt{b}) = k$ and the conclusion follows.
Otherwise take q ≠ 0 with n(q) = 0; on writing q = t + xu + yv + zuv, we have
$0 = n(q) = t^2 - ax^2 - by^2 + abz^2$, hence

    $a = (t^2 - by^2)/(x^2 - bz^2)$,

where the denominator is non-zero because b is not a square, and this shows a to be a
norm in $k(\sqrt{b})$.

(e) ⇒ (f). If $a = N(\alpha) = x^2 - by^2$, then $a + by^2 = x^2$, so (f) holds for (1, y, x).


Finally (f) ⇒ (a): Let $ax^2 + by^2 = z^2$, where x, y are not both zero, say x ≠ 0.
Then

    $\frac{1}{a} = \left(\frac{z}{ax}\right)^2 - b\left(\frac{y}{ax}\right)^2$;

changing variables, we have $z^2 - by^2 = a^{-1}$. Taking the basis of H to be 1, i, j, k, we
put u = zi + yk; then $u^2 = az^2 - aby^2 = a(z^2 - by^2) = 1$. Thus $u^2 = 1$; further, we
have ju = −uj, so if v = [(1 − b) + (1 + b)u]j/2b, then uv = −vu and $v^2 = -1$.
This shows that $H \cong (1, -1; k)$. ∎
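
Condition (f) lends itself to direct search over a small field. As an illustration (with hypothetical names, and restricted to prime fields $F_p$ with p odd), the sketch below looks for a non-zero solution of $ax^2 + by^2 = z^2$. Since this is a form of degree 2 in 3 variables, the Chevalley–Warning theorem guarantees that the search succeeds, in line with the fact that every quaternion algebra over a finite field splits.

```python
from itertools import product


def conic_point(a, b, p):
    """Return a non-zero (x, y, z) in F_p^3 with a*x^2 + b*y^2 = z^2, or None if there is none."""
    for x, y, z in product(range(p), repeat=3):
        if (x, y, z) != (0, 0, 0) and (a * x * x + b * y * y - z * z) % p == 0:
            return (x, y, z)
    return None


def always_split(p):
    """True if condition (f) holds for every pair of non-zero a, b in F_p."""
    return all(conic_point(a, b, p) is not None
               for a in range(1, p) for b in range(1, p))


for p in (3, 5, 7, 11):
    print(p, always_split(p))
```

Over Q or R no such guarantee exists; for example $-x^2 - y^2 = z^2$ has only the zero solution over R, matching the fact that (−1, −1; R) is a division algebra.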


Exercises 


1. Verify directly that (1, 1; k) is split.

2. Show that in the Hamilton quaternions the equation $x^2 + 1 = 0$ has infinitely
many solutions, all conjugate.

3. Show that two elements of a quaternion algebra satisfying the same irreducible
equation either commute or are conjugate. Deduce that every quaternion of
norm 1 is a commutator, i.e. $SK_1(H) = 1$.

4. Show that in characteristic 2, (a, b; k] splits iff $a = N(\alpha)$ for some $\alpha \in k(\wp^{-1}(b))$,
where $\wp(x) = x^2 - x$.

5. Show that (a, b; k) is multiplicative in each factor, i.e.
$(a, b; k) \otimes (a', b; k) \sim (aa', b; k)$ and similarly for the other factor. Likewise for (a, b; k]
when char k = 2.

6. Show that if $H \cong (a, b; k)$ is not split but is split by $k(\sqrt{a'})$, then $H \cong (a', b'; k)$
for suitable $b' \in k$.

7. (A. A. Albert, P. K. Draxl) Show that if $(a', b'; k) \otimes (a'', b''; k)$ is similar to a
quaternion algebra (a, b; k), then there exist $c', c'', d \in k$ such that $(a', b'; k) \cong
(c', d; k)$ and $(a'', b''; k) \cong (c'', d; k)$. Deduce that a tensor product of two
quaternion algebras H, K is split iff H, K have a common splitting field which
is separable quadratic over k.


5.5 Crossed products 


For a closer study of the Brauer group there is a concrete representation of central 
simple algebras which is often useful, namely as a crossed product. Such a represen- 
tation does not exist for each algebra, but there is one in every Brauer class. 




Definition. A central simple k-algebra is called a crossed product if it contains a
maximal subfield F such that F/k is a Galois extension. As a maximal subfield, F will
then be a splitting field (Theorem 5.2.6).


It is easy to see that every Brauer class contains a crossed product: if D is a central
division algebra, then D has a separable splitting field F, by Theorem 5.2.8, and the
normal closure E of F/k is a Galois extension of k. Let [F : k] = r, [E : F] = n; then
$E \subseteq D_n$, and [E : k] = nr, $[D_n : k] = n^2 r^2$, hence E is a maximal subfield of $D_n$, by
Corollary 5.1.12. Thus $D_n$ is a crossed product, though D itself need not be (because
the maximal subfield may not be Galois over k). The situation was first studied in the
1930s by Helmut Hasse, Adrian Albert and others, and it was found that every
central division algebra over Q is a crossed product, but it was only much later
that Shimshon Amitsur [1972] gave examples of central division algebras that are
not crossed products.

Crossed products have an explicit description which is of importance (and which
accounts for the name). Let A be a crossed product, with Galois splitting field F over
k as subfield. Denote by U the group of units of A and by N the normalizer of $F^\times$
in U:

    $N = \{u \in U \mid u^{-1}Fu \subseteq F\}$.

Then $F^\times$ is a normal subgroup of N and we have the exact sequence

    $1 \to F^\times \to N \to \Gamma \to 1$,

where $\Gamma = N/F^\times$. Thus A determines a group extension N of $F^\times$ by $\Gamma$. We shall show
that (i) $\Gamma \cong \mathrm{Gal}(F/k)$, (ii) every extension of $F^\times$ by Gal(F/k) determines a crossed
product (up to isomorphism). This will also provide an explicit form for A.

Any $u \in N$ defines an automorphism $\alpha(u)$ of F by the rule $x^{\alpha(u)} = u^{-1}xu$; this
automorphism leaves k elementwise fixed, so we have a mapping

    $\alpha : N \to \mathrm{Gal}(F/k)$,   (5.5.1)

which is clearly a homomorphism. Its kernel is the centralizer of $F^\times$ in N, which is
$F^\times$, because F is a maximal subfield. To show that $\alpha$ is surjective, let $\sigma \in \mathrm{Gal}(F/k)$;
then by Skolem–Noether (Corollary 5.1.8) $\sigma$ is induced by an inner automorphism
of A, say $\sigma = \alpha(u)$, where $u \in U$. By definition, $u \in N$, so (5.5.1) is surjective and
this shows that $\Gamma \cong \mathrm{Gal}(F/k)$.

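Returning to our crossed product A, let us take a transversal $\{u_\sigma\}$ of $\Gamma = \mathrm{Gal}(F/k)$
in N, so that the elements of N have the form $u_\sigma a$ ($a \in F^\times$, $\sigma \in \Gamma$). We shall indicate
the action of $\Gamma$ on F by putting exponents, thus

    $a u_\sigma = u_\sigma a^\sigma$   ($a \in F$, $\sigma \in \Gamma$).   (5.5.2)

Further, we have

    $u_\sigma u_\tau = u_{\sigma\tau} c_{\sigma,\tau}$   for some $c_{\sigma,\tau} \in F^\times$,   (5.5.3)

where the $c_{\sigma,\tau}$ satisfy the factor set condition (by the associativity of N):

    $c_{\rho\sigma,\tau}\, c_{\rho,\sigma}^{\tau} = c_{\rho,\sigma\tau}\, c_{\sigma,\tau}$.   (5.5.4)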
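
For a quick sanity check of the factor set condition (5.5.4), consider the following sketch. It is an illustration under simplifying assumptions that are not in the text: $\Gamma$ is taken to be cyclic of order n, written additively as the integers mod n, and the factor set takes values in k (here modelled by rational numbers), so that $\Gamma$ acts trivially on them and the exponent $\tau$ in (5.5.4) can be ignored. The factor set tested is the familiar one that equals b when the sum i + j "wraps around" and 1 otherwise.

```python
from fractions import Fraction
from itertools import product


def cyclic_factor_set(n, b):
    """c(i, j) = b if i + j >= n and 1 otherwise, for 0 <= i, j < n (values assumed to lie in k)."""
    def c(i, j):
        return b if (i % n) + (j % n) >= n else Fraction(1)
    return c


def satisfies_factor_set_condition(n, c):
    """Check c(rs, t) * c(r, s) == c(r, st) * c(s, t) for all r, s, t in the cyclic group of order n.

    The Galois action on the values is trivial here because they are assumed to lie in k.
    """
    for r, s, t in product(range(n), repeat=3):
        lhs = c((r + s) % n, t) * c(r, s)
        rhs = c(r, (s + t) % n) * c(s, t)
        if lhs != rhs:
            return False
    return True


b = Fraction(5, 3)
for n in (2, 3, 4, 6):
    print(n, satisfies_factor_set_condition(n, cyclic_factor_set(n, b)))


def not_a_factor_set(i, j):
    """An arbitrary function on pairs that fails the condition for n = 3."""
    return Fraction(2) if (i, j) == (1, 1) else Fraction(1)


print(satisfies_factor_set_condition(3, not_a_factor_set))
```

For n = 2 this factor set has the single non-trivial value $c_{\sigma,\sigma} = b$, which corresponds to the relation $v^2 = b$ in a quaternion algebra (a, b; k) with $F = k(\sqrt{a})$ and $u_\sigma = v$.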


We assert that A is determined completely as right F-space with basis $u_\sigma$ ($\sigma \in \Gamma$) and
the multiplication rules (5.5.2), (5.5.3). We know that $[A : k] = n^2 = [A : F][F : k]$
and [F : k] = n, hence [A : F] = n, so the dimension is correct, since there are
$n = |\Gamma|$ basis elements. It only remains to show that the $u_\sigma$ are right linearly
independent over F. If there is a non-trivial relation

    $\sum u_\sigma a_\sigma = 0$, where $a_\sigma \in F$,   (5.5.5)

let us take such a relation with the fewest non-zero coefficients. Pick $\rho \in \Gamma$ such that
$a_\rho \neq 0$ and multiply on the left by $u_\rho^{-1}$ so as to obtain a relation (5.5.5) with $a_1 \neq 0$.
The left-hand side of (5.5.5) cannot consist of a single term, hence $a_\tau \neq 0$ for some
$\tau \neq 1$. Let $b \in F$ be such that $b^\tau \neq b$ and take the commutator of (5.5.5) with b:

    $0 = \sum u_\sigma b^\sigma a_\sigma - \sum u_\sigma a_\sigma b = \sum u_\sigma a_\sigma (b^\sigma - b)$.

The coefficient of $u_\tau$ is $a_\tau(b^\tau - b) \neq 0$, so this relation is non-trivial, but it has fewer
terms than (5.5.5), because the coefficient of $u_1$ is $a_1(b - b) = 0$. This contradicts
the minimality of (5.5.5) and it shows that the $u_\sigma$ are right F-linearly independent.
We note that this is essentially the argument of Dedekind's lemma (BA,
Lemma 7.5.1).

Suppose now that we are given a (finite) Galois extension F/k with group $\Gamma$, and a
group extension N of $F^\times$ by $\Gamma$, where $\Gamma$ acts on F by automorphisms. Let us take a
transversal $\{u_\sigma\}$ of $\Gamma$ in N; this determines a factor set for which (5.5.3) holds. We
define an algebra A by taking the right F-space on the $u_\sigma$ as basis, with multiplication
defined by (5.5.2), (5.5.3). Then we claim that A is a crossed product.

In the first place, A is simple, for if $\bar A$ is a non-zero quotient, it is spanned by the
$\bar u_\sigma$ ($\sigma \in \Gamma$) over F and $\bar u_\sigma \neq 0$ because $u_\sigma$ is a unit in A and so cannot map to 0. Now
the same argument as before shows that the $\bar u_\sigma$ are linearly independent over F, hence
the mapping $\sum u_\sigma a_\sigma \mapsto \sum \bar u_\sigma a_\sigma$ is injective and A is simple.

Next we note that A has centre k. For if $x = \sum u_\sigma a_\sigma$ lies in the centre, then
xb = bx for all $b \in F$, so $\sum u_\sigma a_\sigma (b^\sigma - b) = 0$. Hence $a_\sigma(b^\sigma - b) = 0$ for all $b \in F$
and $\sigma \in \Gamma$, therefore $a_\sigma = 0$ for $\sigma \neq 1$, and $x = u_1 a_1 \in F$. Now $u_\sigma x = x u_\sigma = u_\sigma x^\sigma$,
hence $x^\sigma = x$ for all $\sigma \in \Gamma$, and so $x \in k$. Thus k is the centre of A. Finally
[A : F] = [F : k] by construction, so F is a splitting field of A. This proves


Theorem 5.5.1. Any crossed product A over k with Galois splitting field F contained in
A is defined up to isomorphism by an extension N of $F^\times$ by Gal(F/k) and conversely,
any such extension N defines a crossed product. ∎


We now examine when two factor sets define isomorphic crossed products.
Identifying our two isomorphic algebras, we have to compare two transversals
$\{u_\sigma\}$, $\{u'_\sigma\}$ in our crossed product A; two such transversals are related by equations

    $u'_\sigma = u_\sigma a_\sigma$   for some $a_\sigma \in F^\times$,

and if the factor set for $\{u'_\sigma\}$ is $\{c'\} = \{c'_{\sigma,\tau}\}$, then

    $c'_{\sigma,\tau} = c_{\sigma,\tau}\, a_{\sigma\tau}^{-1} a_\sigma^\tau a_\tau$;

hence the factor sets c, c' are associated (see (3.1.13)). The factor sets form a group
C under multiplication, the group of 2-cocycles, in which the bounding cocycles
form a subgroup B. These are the cocycles associated to 1:

    $c_{\sigma,\tau} = a_{\sigma\tau}^{-1} a_\sigma^\tau a_\tau$.

The quotient C/B is just $H^2(\Gamma, F^\times)$, the second cohomology group of $\Gamma$ with
coefficients in $F^\times$, and by Theorem 5.5.1 we have a mapping

    $H^2(\Gamma, F^\times) \to B_k$.   (5.5.6)

The above remarks show this mapping to be injective and its image is the relative
Brauer group B(F/k), the subgroup of Brauer classes split by F, already encountered
in Section 5.2. Since each central simple k-algebra has a separable splitting field,
contained in a Galois extension of k, it follows that $B_k$ is a union of the B(F/k),
as F ranges over the finite Galois extensions of k.

It remains to show that (5.5.6) is a homomorphism. To establish this fact we need
to verify that the tensor product of algebras corresponds to the Baer product of the
extensions. Take $w \in B_k$ with Galois splitting field F and let $B \in w$. Put [F : k] = n,
$[B : k] = r^2$ and let V be an F-space of dimension r; then

    $B^\circ \otimes F = B^\circ_F \cong F_r = \mathrm{End}_F(V)$.

Let A be the centralizer of $B^\circ$ in $\mathrm{End}_k(V)$; then $A \sim B$ and $[A : k] = n^2$. Since A
contains F, we have [A : F] = n, so F is a maximal subfield and A is a crossed
product. We can realize this situation by taking V to be a simple left ideal in $B_F$; then V is
a (B, F)-bimodule, i.e. a right $(B^\circ \otimes F)$-module, and [V : F] = r. Moreover, A has a
right F-basis $\{u_\sigma\}$ and each $u_\sigma$ defines an F-semilinear transformation with
automorphism $\sigma$:

    $(\alpha x + \beta y)u_\sigma = \alpha^\sigma x u_\sigma + \beta^\sigma y u_\sigma$   for any $x, y \in V$, $\alpha, \beta \in F$.

Thus A is spanned over F by semilinear transformations.

Given two Brauer classes w, w', let us take $B \in w$, $B' \in w'$ and let V, V' be simple
left ideals in $B_F$, $B'_F$ respectively, where F is a Galois splitting field for w and w'. Then
$B^\circ_F \cong \mathrm{End}_F(V)$, $B'^\circ_F \cong \mathrm{End}_F(V')$, hence

    $\mathrm{End}_F(V \otimes_F V') \cong B^\circ_F \otimes_F B'^\circ_F \cong (B^\circ \otimes B'^\circ)_F = (B \otimes_k B')^\circ_F$.

Denote the respective centralizers of $B^\circ$, $B'^\circ$, $(B \otimes B')^\circ$ by A, A', A''; then $A \sim B$,
$A' \sim B'$, $A'' \sim B \otimes B'$ and so (A)(A') = (A''). Let N, N', N'' be the normalizers of
$F^\times$ in A, A', A'' respectively; to show that (5.5.6) is a homomorphism we must
prove that N'' is just the Baer product of N and N'. To find this Baer product,
let $N \circ N'$ be the pullback of the mappings $N \to \Gamma$, $N' \to \Gamma$, i.e. the subgroup of
N × N' of elements of the form $(u_\sigma \alpha, u'_\sigma \beta)$, where $\alpha, \beta \in F^\times$, $u_\sigma$, $u'_\sigma$ are
transversals of $\Gamma$ in N, N' respectively and $u_1 = u'_1 = 1$ for simplicity. The set
$L = \{(\lambda, \lambda^{-1}) \mid \lambda \in F^\times\}$ is a normal subgroup of $N \circ N'$, the elements of $N \circ N'/L$
can uniquely be written as $(u_\sigma, u'_\sigma \alpha)$ and it is easily verified that this is an extension
of $F^\times$ by $\Gamma$ with factor set equal to the product of those of N and N', thus it is the
Baer product.

Now each element of $N \circ N'$ defines a semilinear transformation on $V \otimes V'$:

    $(u_\sigma \alpha, u'_\sigma \alpha') : v \otimes v' \mapsto v(u_\sigma \alpha) \otimes v'(u'_\sigma \alpha')$.

Hence we have a homomorphism $f : N \circ N' \to N''$. Clearly it is surjective and
the kernel consists of all $(u_\sigma \alpha, u'_\sigma \alpha')$ inducing the identity, i.e. $\sigma = 1$, $\alpha\alpha' = 1$.
Hence ker f = L and so N'' is isomorphic to $N \circ N'/L$, the Baer product; this
shows (5.5.6) to be a homomorphism. We sum up the result as


Theorem 5.5.2. Let F/k be a finite Galois extension with group $\Gamma$. Then

    $B(F/k) \cong H^2(\Gamma, F^\times)$. ∎   (5.5.7)


Once we have the homomorphism property, the injectivity also follows from the 
explicit criterion for splitting: 


Proposition 5.5.3. Let A be a central simple k-algebra of degree n which is a crossed
product with maximal subfield F. Then $A \cong \mathfrak{M}_n(k)$ if and only if the extension of
$F^\times$ by Gal(F/k) in A splits.


Proof. Write $\Gamma = \mathrm{Gal}(F/k)$. If the extension of $F^\times$ by $\Gamma$ splits, then we can realize it
with 1 as factor set. Put $A = \mathrm{End}_k(F) \cong k_n$. Each $\sigma \in \Gamma$ acts as a k-linear
transformation of $F \cong k^n$, while F itself acts by right multiplication. By Dedekind's lemma (BA,
Lemma 7.5.1) the $\sigma$ are linearly independent over F, so we obtain an n-dimensional
F-space, i.e. an $n^2$-dimensional k-space, which must be all of A, by a comparison of
dimensions. Thus A is realized as a crossed product. Conversely, any factor set for the
algebra $k_n$ is associated to the trivial factor set (which we saw defines $k_n$) and hence
itself corresponds to a split extension. ∎


Let us examine the Brauer group in a little more detail. 


Theorem 5.5.4. For any field k the Brauer group $B_k$ is a torsion group. More precisely, if
$w \in B_k$ has index r, then $w^r = 1$. Hence for an extension F/k of degree n we have
n.B(F/k) = 0.


Proof. Let $w \in B_k$ have index r and take a Galois splitting field F of w, where
[F : k] = n, say. By Theorem 5.5.2, w corresponds to an element c of $H^2(\Gamma, F^\times)$
and it follows from Proposition 3.1.6 that the order of c and hence of w divides n,
but we want to get the sharper bound r.

Let $A \in w$ be a crossed product with F as maximal subfield and let V be a minimal
right ideal of A; then [V : F] = r and we can represent A by F-linear transformations
of V. With a right F-basis $v_1, \ldots, v_r$ of V we have $v_j a = \sum_i v_i a_{ij}$ for any $a \in A$, or in
matrix form

    $(v)a = (v)\alpha$,

where $(v) = (v_1, \ldots, v_r)$ and $\alpha = (a_{ij})$. In particular, if $u_\sigma \mapsto U_\sigma$, we have

    $(v)u_\sigma u_\tau = (v)U_\sigma u_\tau = (v)u_\tau U_\sigma^\tau = (v)U_\tau U_\sigma^\tau$,
    $(v)u_\sigma u_\tau = (v)u_{\sigma\tau} c_{\sigma,\tau} = (v)U_{\sigma\tau} c_{\sigma,\tau}$,

hence

    $U_{\sigma\tau}\, c_{\sigma,\tau} = U_\tau U_\sigma^\tau$,   (5.5.8)

where the $U_\sigma$ are r × r matrices over F. Now write $d_\sigma = \det U_\sigma$ and take
determinants in (5.5.8):

    $d_{\sigma\tau}\, c_{\sigma,\tau}^r = d_\tau d_\sigma^\tau$,

hence $\{c_{\sigma,\tau}^r\}$ is a splitting factor set; therefore $(A)^r = 1$. ∎


If k is a perfect field of characteristic p, then it can be shown that the p-component of
$B_k$ is trivial (see Exercise 2).

The order of w as an element of $B_k$ is called its exponent. For any central simple
algebra A, its exponent is defined as the exponent of its Brauer class, and Theorem 
5.5.4 may be expressed by saying that for any Brauer class, the exponent divides the 
index. The question naturally arises whether the exponent is always equal to the 
index. For rational algebras this is true, but not in general; however we do have 
the following connexion: 


Proposition 5.5.5. For any $w \in B_k$ the index and the exponent have the same prime
factors.


Proof. Let w have index m and exponent t, so that t | m, by Theorem 5.5.4. We have
to show that any prime factor p of m also divides t. Take a Galois splitting field F
of w, with group $\Gamma$ and let S be a Sylow p-subgroup of $\Gamma$ with fixed field E.
Then $[E : k] = (\Gamma : S) = \nu$ is prime to p, while $|S| = p^\alpha$. For any $A \in w$, the index
reduction factor from A to $A_E$ divides $\nu$ (Proposition 5.2.3) and so is prime to p,
hence the index of $A_E$ is still divisible by p and it is enough to show that p also divides
the exponent. Now $A_E$ is a central simple E-algebra which does not split but which is
split by F. Since $[F : E] = p^\alpha$, its exponent is a positive power of p, as we had to
show. ∎
This result leads to a remarkable decomposition formula: 


Theorem 5.5.6. Any central division algebra D of degree $m = q_1 \cdots q_r$, where the $q_i$ are
powers of distinct primes, has the decomposition

    $D \cong D^{(1)} \otimes \cdots \otimes D^{(r)}$,   (5.5.9)

where $D^{(i)}$ is a central division algebra of degree $q_i$.

We note that the assertion is that (5.5.9) is an isomorphism. The corresponding 
assertion, with ‘isomorphism’ replaced by ‘similarity’ is a trivial consequence of 
the basis theorem for abelian groups, applied to the cyclic subgroup generated by 
the Brauer class of D. 




Proof. The class (D) has exponent $n = q'_1 \cdots q'_r$, where $q'_i \mid q_i$ and by Proposition
5.5.5, $q'_i > 1$. By the basis theorem for abelian groups, (D) can be written as a
product of classes which are powers of (D) with prime power exponent. Let $D^{(i)}$
be a division algebra similar to a power of D with exponent $q'_i$; then

    $D^{(1)} \otimes \cdots \otimes D^{(r)} \sim D$.

By Proposition 5.5.5 the $D^{(i)}$ have coprime degrees and by Proposition 5.2.10 we
have a division algebra on the left, hence the two sides are isomorphic. ∎


A central simple k-algebra is called primary if it is not equal to k and it contains no
proper central simple subalgebra. Thus if A is not primary, it is either k or it has a
central simple subalgebra B ≠ k, A. By Proposition 5.1.5 we have $A = B \otimes B'$, where
B' is the centralizer of B in A. Bearing in mind Theorem 5.5.6 and the relation
$k_{rs} \cong k_r \otimes k_s$, we see that any primary algebra is a division algebra of prime power
degree or of the form $k_p$. Thus we obtain


Proposition 5.5.7. Every central simple algebra is a tensor product of primary algebras.
The primary k-algebras are $\mathfrak{M}_p(k)$, where p is a prime, and certain division algebras of
prime power degree. ∎


A division algebra of prime power degree is not necessarily primary, though this 
does hold over an algebraic number field. 


Exercises 


1. Let $F/k$ be a finite Galois extension with group $\Gamma$ of order $n$. Show that

F@,k0 &k,. (Hint. Use a normal basis for F/k.) 

2. Let $k$ be a perfect field of prime characteristic $p$ and $D$ a central division algebra.
Show that Deg $D$ is prime to $p$. (Hint. Use Theorem 5.2.8 to show that $D$ contains
no proper extension of degree $p$ over $k$.) Deduce that $B_k$ has trivial $p$-component.

3. Show that every division $k$-algebra has a splitting field which is a tensor product of
extensions of $k$ with prime power degrees.

4. Let $G$ be a group whose centre $Z$ is free abelian and of finite index $n$ in $G$. By
constructing a suitable crossed product with group $G/Z$, show that $G$ can be
embedded in a division algebra of degree $n$.

5. Let $A$, $B$ be central division $k$-algebras that are crossed products with groups $G$, $H$.
Show that if $A \otimes B$ is a division algebra, then it is a crossed product with group
$G \times H$.




5.6 Change of base field 


Let us consider the effect of changes in the base field on a crossed product. We begin 
by recalling a result from Galois theory (BA, Theorem 7.10.3): 


210 Central simple algebras 


Proposition 5.6.1. Let $F/k$ be a Galois extension and $E$ any field extension of $k$, where
$E$, $F$ are both contained in the same field. Then $EF/E$ is Galois, with group isomorphic to
the subgroup of $\mathrm{Gal}(F/k)$ corresponding to $E \cap F$.


Proof. This is essentially a translation of the parallelogram rule applied to the Galois
groups. The isomorphism is obtained by taking an automorphism of $EF/E$ and
restricting it to $F$; this provides an isomorphism with $\mathrm{Gal}(F/E \cap F)$. ∎


In the above situation let us write $G = \mathrm{Gal}(F/k)$ and denote by $H$ the subgroup
leaving $E \cap F$ fixed, so that $H \cong \mathrm{Gal}(EF/E)$. Any factor set $\{c\} : G \times G \to F^\times$ when
restricted to $H$ yields a factor set $\{c'\} : H \times H \to (EF)^\times$. It is clear that a split factor
set has a split restriction, hence the inclusion $H \subseteq G$ gives rise to a homomorphism,
the restriction

$$\mathrm{res} : H^2(G, F^\times) \to H^2(H, (EF)^\times). \tag{5.6.1}$$

In what follows we shall write $(F/k, c)$ for the crossed product over $F/k$ with factor
set $\{c\}$.


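Theorem 5.6.2 (Restriction theorem). Let $F/k$ be a finite Galois extension with group
$G = \mathrm{Gal}(F/k)$, let $E/k$ be any extension (within a field containing $F$) and let $H$ be the
subgroup corresponding to $K = E \cap F$, so that $H \cong \mathrm{Gal}(EF/E)$. Then

$$(F/k, c)_E \sim (EF/E, c'), \tag{5.6.2}$$

where $\{c'\}$ is the factor set $\{c\}$ restricted to $H$:

$$\begin{array}{ccc}
H^2(G, F^\times) & \longrightarrow & H^2(H, F^\times) \\
\downarrow\cong & & \downarrow\cong \\
B(F/k) & \xrightarrow{\ \otimes K\ } & B(F/K)
\end{array}$$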


Proof. We shall show

$$(F/k, c) \otimes_k K \cong (F/K, c') \otimes k_r, \quad \text{where } r = [K : k], \tag{5.6.3}$$

$$(F/K, c') \otimes_K E \cong (EF/E, c'). \tag{5.6.4}$$

It is clear that (5.6.2) is a consequence of (5.6.3) and (5.6.4). Let us write
$A = (F/k, c)$; this algebra contains $F$ and hence $K$ as a subfield. If $K'$ denotes the
centralizer of $K$ in $A$, then by Brauer's theorem (Theorem 5.1.10),

$$K' \otimes k_r \cong A \otimes_k K,$$

and this will establish (5.6.3) if we can show that $K' \cong (F/K, c')$. In $A$ take
$x = \sum u_\sigma a_\sigma$ ($a_\sigma \in F$); we have $x \in K'$ iff $xy = yx$ for all $y \in K$, i.e. $\sum u_\sigma a_\sigma y =
\sum y u_\sigma a_\sigma = \sum u_\sigma y^\sigma a_\sigma$. Thus we must have $a_\sigma = 0$ whenever $y^\sigma \ne y$ for some $y \in K$, i.e.
when $\sigma \notin \mathrm{Gal}(F/K) = H$. This shows that $K' = \{\sum u_\sigma a_\sigma \mid a_\sigma \in F,\ \sigma \in H\}$ and
(5.6.3) follows.




To prove (5.6.4) we note that the $K$-algebra homomorphism

$$F \otimes_K E \to EF, \qquad x \otimes y \mapsto xy,$$

is surjective and both sides have dimension $[EF : E] = |H| = [F : K]$ over $E$, so it is
an isomorphism (thus the composite $EF$ is independent of the choice of embedding;
this depends on $F$ being normal over $K$). Hence

$$(F/K, c') \otimes_K E = \bigoplus_{\sigma \in H} (u_\sigma \otimes 1)EF \cong (EF/E, c'),$$

and (5.6.4) follows. ∎


There is a second operation called inflation, corresponding to the natural homomorphism
$G \to G/N$, for $N \lhd G$. Given a factor set $\{\bar c\}$ on $\bar G = G/N$, we define a factor
set $\{c\}$ on $G$ by the inflation rule (where $\sigma \mapsto \bar\sigma$ is the natural homomorphism from
$G$ to $G/N$):

$$c_{\sigma,\tau} = \bar c_{\bar\sigma,\bar\tau}. \tag{5.6.5}$$


Theorem 5.6.3 (Inflation theorem). Let $k \subseteq K \subseteq F$, where $F/k$, $K/k$ are Galois extensions,
$G = \mathrm{Gal}(F/k)$ and $N$ is the (normal) subgroup of $G$ corresponding to $K$.
Given any factor set $\{\bar c\}$ on $G/N$ and the corresponding factor set $\{c\}$ on $G$ derived by
the inflation rule (5.6.5), then

$$(F/k, c) \cong (K/k, \bar c) \otimes k_r, \quad \text{where } r = [F : K]. \tag{5.6.6}$$
Proof. Let $[F : K] = r$, $[K : k] = s$, $\bar G = G/N$ and define $B = (K/k, \bar c) \otimes k_r$. The field
$F$ can be embedded in $K_r$ and hence in $B$; now $B$ is a central simple $k$-algebra split by
$K$, hence also by $F$ and Deg $B = rs = [F : k]$, so $F$ is a maximal subfield of $B$, therefore
$B$ is a crossed product. We shall prove that $B \cong (F/k, c)$ by constructing an
explicit embedding of $F$ in $B$; this will establish (5.6.6).

Take a $K$-basis $e_1, \dots, e_r$ for $F$ and define $T(x) = (t_{ij}(x))$ by

$$e_i x = \sum_j t_{ij}(x) e_j \quad \text{for } x \in F.$$

On writing $e = (e_1, \dots, e_r)^T$, we can express this equation in matrix form as

$$ex = T(x)e, \quad \text{where } T(x) \in K_r. \tag{5.6.7}$$

Since $e_i^\sigma \in F$ for all $\sigma \in G$, we have

$$e^\sigma = P_\sigma e, \quad \text{where } P_\sigma \in K_r. \tag{5.6.8}$$

It follows that $P_{\sigma\tau}e = e^{\sigma\tau} = (P_\sigma e)^\tau = P_\sigma^\tau e^\tau = P_\sigma^\tau P_\tau e$, hence

$$P_{\sigma\tau} = P_\sigma^{\bar\tau} P_\tau, \tag{5.6.9}$$

where we have replaced $\tau$ by $\bar\tau$ in the action on $P$ because the latter has entries in $K$.
Applying $\sigma$ to (5.6.7), we find $e^\sigma x^\sigma = T(x)^\sigma e^\sigma$, i.e. $P_\sigma T(x^\sigma) = T(x)^\sigma P_\sigma$ and again
$T(x)$ has entries in $K$, so that

$$P_\sigma T(x^\sigma) = T(x)^{\bar\sigma} P_\sigma. \tag{5.6.10}$$




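We claim that for any right $K$-basis $u_{\bar\sigma}$ of $(K/k, \bar c)$,

$$v_\sigma = u_{\bar\sigma} P_\sigma \tag{5.6.11}$$

is a right $F$-basis for $B = (K/k, \bar c) \otimes k_r$, with the isomorphism

$$\sum v_\sigma a_\sigma \leftrightarrow \sum u_{\bar\sigma} P_\sigma T(a_\sigma).$$

For the proof we need only verify the conditions on $v_\sigma$; using (5.6.10), we have

$$x v_\sigma = T(x) u_{\bar\sigma} P_\sigma = u_{\bar\sigma} T(x)^{\bar\sigma} P_\sigma = u_{\bar\sigma} P_\sigma T(x^\sigma) = v_\sigma x^\sigma$$

and

$$v_\sigma v_\tau = u_{\bar\sigma} P_\sigma u_{\bar\tau} P_\tau = u_{\bar\sigma} u_{\bar\tau} P_\sigma^{\bar\tau} P_\tau = u_{\bar\sigma\bar\tau}\, \bar c_{\bar\sigma,\bar\tau} P_{\sigma\tau} = v_{\sigma\tau} c_{\sigma,\tau}.$$

This shows that $B \cong (F/k, c)$, as claimed, and it proves (5.6.6). ∎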
Since r > 1 except in the trivial case F = K, we obtain from Theorem 5.6.3, 


Corollary 5.6.4. A central simple algebra obtained by inflation is never a division
algebra. ∎


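We note that the natural homomorphism $G \to G/N$ induces the inflation homomorphism
$H^2(G/N, K^\times) \to H^2(G, F^\times)$ and Theorem 5.6.3 can be expressed as a
commutative square, which together with the previous ones gives the commutative
diagram

$$\begin{array}{ccccccc}
0 & \longrightarrow & H^2(G/N, K^\times) & \xrightarrow{\ \mathrm{inf}\ } & H^2(G, F^\times) & \xrightarrow{\ \mathrm{res}\ } & H^2(N, F^\times) \\
 & & \downarrow & & \downarrow & & \downarrow \\
0 & \longrightarrow & B(K/k) & \xrightarrow{\ \mathrm{inc}\ } & B(F/k) & \longrightarrow & B(F/K)
\end{array}$$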


The bottom row is easily seen to be exact: a central simple k-algebra split by F will 
split as K-algebra iff it is split by K. Hence the top row is also exact. 

As a third operation we have the corestriction (or transfer), which for $k \subseteq F$
provides a homomorphism $B_F \to B_k$.

Let $B$ be an $F$-algebra with a finite group $G$ of automorphisms such that each
element of $G$ other than 1 restricts to a non-trivial automorphism of $F$. As usual
we write $B^G$ and $F^G$ for the subset fixed by $G$. Given a $k$-algebra $A$, if $B = A_F$,
where $k = F^G$, then $B^G = A$, as is easily verified.

We begin by showing that $B^G$ can be expressed in terms of the trace, where for
$b \in B$ we define $\mathrm{tr}\, b = \sum_{\sigma \in G} b^\sigma$ and $\mathrm{tr}\, B = \{\mathrm{tr}\, b \mid b \in B\}$.


Lemma 5.6.5. Let $B$ be an $F$-algebra with a finite group $G$ of automorphisms which induce
distinct automorphisms on $F$. Then $\mathrm{tr}\, B$ coincides with the fixed algebra $B^G$ and if
$F^G = k$, then

$$B \cong B^G \otimes_k F. \tag{5.6.12}$$




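Proof. Clearly $\mathrm{tr}\, B \subseteq B^G$ and both $\mathrm{tr}\, B$, $B^G$ are vector spaces over $k$. Let $C$ be the
$F$-space spanned by $\mathrm{tr}\, B$; if $C \subset B$, then there is an $F$-linear functional $\varphi : B \to F$ such
that $\varphi(C) = 0$ but $\varphi \ne 0$, say $\varphi(u) \ne 0$. For any $\alpha \in F$ we have

$$0 = \varphi(\mathrm{tr}(\alpha u)) = \varphi\Bigl(\sum \alpha^\sigma u^\sigma\Bigr) = \sum \alpha^\sigma \varphi(u^\sigma).$$

By hypothesis the automorphisms $\sigma$ of $F$ are distinct and hence by Dedekind's
lemma they are linearly independent, contradicting the assumption that $\varphi(u) \ne 0$.
Hence $\varphi$ must vanish and $C = B$.

We thus have a canonical map $B^G \otimes F \to B$, which we claim is injective. Any
element of $B^G \otimes F$ can be written as $x = \sum b_i \otimes \alpha_i$, where $b_i \in B^G$, $\alpha_i \in F$ and we
may take the $b_i$ to be linearly independent over $k$. Suppose that $x \ne 0$, but that
$\sum b_i \alpha_i = 0$; then for any $\beta \in F$,

$$0 = \mathrm{tr}\Bigl(\beta \sum_i b_i \alpha_i\Bigr) = \sum_\sigma \sum_i \beta^\sigma \alpha_i^\sigma b_i = \sum_i \mathrm{tr}(\beta\alpha_i)\, b_i.$$

By the linear independence of the $b_i$ we have $\mathrm{tr}(\beta\alpha_i) = 0$ for all $\beta \in F$, hence

$$\sum_\sigma \beta^\sigma \alpha_i^\sigma = 0 \quad \text{for all } \beta \in F,$$

and this again contradicts Dedekind's lemma, unless $\alpha_i = 0$ for all $i$, so
$x = \sum b_i \otimes \alpha_i = 0$ and our mapping is injective. By comparing dimensions in
(5.6.12), we see that this is an isomorphism. ∎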


If $B$ is simple with centre $F$, then (5.6.12) shows that $B^G$ is a central simple
$k$-algebra such that $[B : F] = [B^G : k]$.

Let $E/k$ be a Galois extension with group $G$ and let $B$ be an $E$-algebra. We define
$B^\sigma$, for $\sigma \in G$, as an $E$-algebra on $B$ with the same ring structure as $B$ but with scalar
multiplication

$$\alpha \cdot x = \alpha^\sigma x \quad \text{for } \alpha \in E,\ x \in B.$$


Given any separable extension $F/k$, let $E/k$ be a Galois extension containing $F/k$, with
group $G$, and denote by $H$ the subgroup corresponding to $F$. Put $n = [F : k] =
(G : H)$ and let $\sigma_1, \dots, \sigma_n$ be a transversal of $H$ in $G$: $G = \bigcup H\sigma_i$. Then for any
$F$-algebra $B$ the corestriction from $F$ to $k$ is defined as follows. Put

$$B^{(G,H)} = B^{\sigma_1} \otimes \cdots \otimes B^{\sigma_n}, \tag{5.6.13}$$

and define a $G$-action on $B^{(G,H)}$ by writing, for any $\sigma \in G$, $\sigma_i\sigma = \tau_i(\sigma)\sigma_{i'}$, where
$\tau_i(\sigma) \in H$ and $i \mapsto i'$ is a permutation of $1, \dots, n$ determined by $\sigma$. Now put

(b; @ aj)” = by @ iho) (5.6.14) 


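To check that this indeed defines a $G$-action, let $\sigma_i\sigma = \tau_i(\sigma)\sigma_{i'}$, $\sigma_{i'}\tau = \tau_{i'}(\tau)\sigma_{i''}$. Then
$\sigma_i\sigma\tau = \tau_i(\sigma)\tau_{i'}(\tau)\sigma_{i''}$ and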


nN # tt f} 


&) ((b; @ca,)7)? = & (D;: Qa) \ _ & (b;- @ of!" (1) )= &) (b, Qa). 


t=] ead pe] al 


Now the corestriction of $B$ is the subalgebra of $B^{(G,H)}$ fixed by $G$:

$$\mathrm{cor}_{F/k} B = (B^{(G,H)})^G. \tag{5.6.15}$$

For example, taking $B = F$, we have $F^{(G,H)} = \bigotimes F^{\sigma_i}$ and the fixed ring is $k^{\otimes n} = k$,
hence we have

$$\mathrm{cor}_{F/k}(F) = k. \tag{5.6.16}$$


Proposition 5.6.6. Let $F/k$ be a separable extension of degree $n$ and $B$ a central simple
$F$-algebra. Then $\mathrm{cor}_{F/k} B$ is a central simple $k$-algebra which depends only on $F/k$, not on
the Galois extension or the choice of transversal. Moreover, the correspondence $\mathrm{cor}_{F/k}$ is
a homomorphism from $B_F$ to $B_k$.

Proof. If $B$ is any central simple $F$-algebra, then $C = B^{(G,H)}$ is simple with centre $E$,
by Corollary 5.1.3. Hence $C^G$ has centre $k$; now any ideal of $C^G$ gives rise to an ideal
of $C$, and the simplicity of the latter shows $C^G$ to be simple.

Now let $N$ be a normal subgroup of $G$ contained in $H$ and let $L$ be the corresponding
subfield; thus $F \subseteq L \subseteq E$ and $L/k$ is Galois, with group $G/N$. The transversal
$\sigma_1, \dots, \sigma_n$ of $H$ in $G$ is still a transversal of $\bar H = H/N$ in $\bar G = G/N$. Suppose that
$A = \mathrm{cor}_{F/k}(B)$ is formed as above, going via $E$; going via $L$ we obtain $\bar C = B^{(\bar G,\bar H)}$
and it is clear that $\bar C = (B^{(G,H)})^N$. Therefore

$$\bar C^{\bar G} = ((B^{(G,H)})^N)^{\bar G} = (B^{(G,H)})^G,$$

so we reach the same algebra going via $L$. Given any two extensions $L_1$, $L_2$ of $F$, we
can find a Galois extension $E/k$ containing both, and the above argument shows that
using $L_1$ or $L_2$ we obtain the same algebra $\mathrm{cor}_{F/k}(B)$ as if we had used $E$, hence
all three cases give the same result. Further, suppose that $B = B' \otimes B''$, write
$C = B^{(G,H)}$ and define $C'$, $C''$ similarly in terms of $B'$, $B''$. Then by the associativity
and commutativity of the tensor product, $C \cong C' \otimes C''$, hence $C^G \cong C'^G \otimes C''^G$ and
this shows the corestriction to be a homomorphism. ∎


For our next result we note that if $K$ is any commutative ring and $E$ is a commutative
$K$-algebra, then for any $K$-modules $U$, $V$, writing again $U_E = U \otimes_K E$ etc., we
have the $K$-module isomorphism

$$U_E \otimes_E V_E \cong (U \otimes_K V)_E \tag{5.6.17}$$

given by the mapping $(u \otimes \alpha) \otimes (v \otimes \beta) \mapsto (u \otimes v) \otimes \alpha\beta$.


Proposition 5.6.7. Let $F/k$ be a separable extension of degree $n$ and let $A$ be a central
simple $k$-algebra. Then

$$\mathrm{cor}_{F/k}(A_F) \cong A^{\otimes n}. \tag{5.6.18}$$




Proof. Take a Galois extension $E/k$ containing $F$, with group $G$ and subgroup $H$ corresponding
to $F$. We have $A_F \otimes_F E = A \otimes_k E = A_E$ and $A_E^\sigma = A \otimes E^\sigma$, for any $\sigma \in G$,
hence

$$(A_F)^{(G,H)} = \bigotimes (A_E)^{\sigma_i} \cong \bigotimes (A \otimes E^{\sigma_i}) \cong A^{\otimes n} \otimes \Bigl(\bigotimes E^{\sigma_i}\Bigr),$$

in terms of a transversal of $H$ in $G$, as before. Taking fixed subrings and using
(5.6.16), we obtain (5.6.18). ∎


We remark that in the isomorphism with cohomology groups, (5.6.18) corresponds
to the formula $\mathrm{cor} \circ \mathrm{res} = n$.


Exercises 


1. Let $F/k$ be a finite Galois extension. Show that for any $k$-algebra $A$, $\mathrm{tr}\, A_F = A$.

2. Given a field $F$ of characteristic $p \ne 0$, an $F$-algebra $B$ and a group $G$ of automorphisms
of $B$ inducing distinct automorphisms of $F$, show that the order of
$G$ is prime to $p$.

3. Show that for a central simple $F$-algebra $B$, if $[F : k] = n$, then $[\mathrm{cor}\, B : k] = [B : F]^n$.

4. Let $A$ be a central simple $k$-algebra of degree $p^r$, where $p$ is prime. Show that $A$ is
similar to a crossed product of degree $p^s$, for some $s$. (Hint. Take a Galois splitting
field and a subfield corresponding to a Sylow $p$-subgroup.) What can be said
about the relations of $r$ and $s$?




5.7 Cyclic algebras 


The simplest crossed products are those with cyclic groups; they are called cyclic
algebras and can be brought to the following simple form. Let $F/k$ be a cyclic Galois
extension of degree $n$, with group $G$ generated by $\sigma$. We shall choose an $F$-basis for
our algebra $A$ as follows: $u_0 = 1$, $u_i = u^i$ ($i = 1, \dots, n-1$), where $u$ is an element
of $A$ inducing $\sigma$. Since $u^i$ induces $\sigma^i$, we have indeed an $F$-basis; moreover,
$u^n = \alpha \in F$ and since $u\alpha^\sigma = \alpha u = u^{n+1} = u\alpha$, we have $\alpha^\sigma = \alpha$, so $\alpha \in k$. Thus the
multiplication in $A$ is given by

$$u_i u_j = \begin{cases} u^{i+j} & \text{if } i + j < n, \\ \alpha u^{i+j-n} & \text{if } i + j \ge n. \end{cases} \tag{5.7.1}$$

The element $u$ is called the canonical generator of $A$. We see that $A$ is determined up
to isomorphism by $F/k$, $\alpha$ and $\sigma$, and one also writes $A = (F/k, \sigma, \alpha)$.
Our first concern is to find when two presentations give isomorphic algebras: 


Proposition 5.7.1. Let $F/k$ be a cyclic Galois extension of degree $n$. Two cyclic algebras
$(F/k, \sigma, \alpha)$ and $(F/k, \sigma, \beta)$ are isomorphic if and only if $\beta/\alpha = N_{F/k}(c)$, where $c \in F$.
In particular, $(F/k, \sigma, \alpha)$ splits precisely when $\alpha = N_{F/k}(c)$ ($c \in F$).




This condition is more briefly expressed by saying that $\alpha$ (resp. $\beta/\alpha$) is a norm
from $F$ to $k$.


Proof. Assume that $(F/k, \sigma, \alpha) \cong (F/k, \sigma, \beta)$; then the canonical generators $u$, $v$ are
related by an equation $v = uc$, where $c \in F$. Hence $v^i = (uc)^i = u^i c c^\sigma \cdots c^{\sigma^{i-1}}$; for
$i = n$ we find $v^n = (uc)^n = u^n N(c)$, i.e. $N(c) = \beta/\alpha$. Conversely, if $\beta/\alpha = N(c)$,
then the same calculation shows that $(uc)^n = \beta$, so that $uc$ is a canonical generator
for the isomorphic algebra $(F/k, \sigma, \beta)$. ∎
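
As an illustration of the norm criterion, consider the smallest case $F/k = \mathbf{C}/\mathbf{R}$ with $\sigma$ complex conjugation, where $N(c) = |c|^2$, so that $\alpha$ is a norm exactly when $\alpha > 0$. The following Python sketch is not taken from the text: it assumes a particular $2 \times 2$ matrix model (the names `element_matrix` and `reduced_norm` are hypothetical) in which $c \in \mathbf{C}$ acts as $\mathrm{diag}(c, \bar c)$ and $u$ as a matrix with $u^2 = \alpha$, and it computes the determinant $|a|^2 - \alpha|b|^2$ of a general element $a + ub$.

```python
def element_matrix(a, b, alpha):
    """Matrix of a + u*b in (C/R, conjugation, alpha), in an assumed model.

    Assumption (illustrative, not from the text): c in C is represented by
    diag(c, conj(c)) and u by [[0, alpha], [1, 0]], so that u*u = alpha
    and c*u = u*conj(c).
    """
    a, b = complex(a), complex(b)
    return [[a, alpha * b.conjugate()], [b, a.conjugate()]]


def reduced_norm(a, b, alpha):
    """Determinant of element_matrix(a, b, alpha), i.e. |a|^2 - alpha*|b|^2."""
    m = element_matrix(a, b, alpha)
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return det.real


# alpha = 5 = N(2 + i) is a norm: the non-zero element (2 + i) + u is singular,
# so the algebra has zero-divisors and cannot be a division algebra.
singular = reduced_norm(2 + 1j, 1, 5)

# alpha = -1 is not a norm: |a|^2 + |b|^2 vanishes only for a = b = 0,
# which is the case of the real quaternions.
nonsingular = reduced_norm(1, 1, -1)
```

In this model a non-zero element with vanishing determinant exists precisely when $\alpha = |a|^2/|b|^2$ for some $a$, $b \neq 0$, which matches the statement that the algebra splits when $\alpha$ is a norm.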


For example, $(F/k, \sigma, 1) \cong k_n$ (by Proposition 5.5.3); this algebra can be realized
as endomorphism ring of $F \cong k^n$, e.g. by taking a normal basis in $F$. Then $\sigma$ acts by
cyclic permutation of the coordinates.

The above presentation of a cyclic $k$-algebra $A$ only provides a basis over $F$, but
frequently one needs to have a basis over $k$. Such a basis takes a simple form if $k$ contains
a primitive $n$-th root of 1, say $\omega$. Then a $k$-basis for $A$ can be formed as follows.
The splitting field $F$ is of the form $F = k(v)$, where $v^n = \beta \in k$. Taking $u \in A$ such
that $u^{-1}vu = \omega v$, we have $u^n = \alpha \in k$ and so $A$ has the $k$-basis $u^i v^j$ ($i, j = 0, 1, \dots,
n-1$) with the defining relations

$$u^n = \alpha, \quad v^n = \beta, \quad vu = \omega uv. \tag{5.7.2}$$

By a symbol $(\alpha, \beta; k)_n$, or $(\alpha, \beta)_n$, one understands a cyclic algebra over $k$ with the
presentation (5.7.2). We note the following consequence of (5.7.2):

$$(u + v)^n = \alpha + \beta. \tag{5.7.3}$$

For on expansion the left-hand side of (5.7.3) is a sum of products of degree $n$.
Consider a product $P$ involving $i$ factors $u$ and $n - i$ factors $v$; by moving the last
factor to the first place we obtain $\omega^i P$, whether this factor was $u$ or $v$. Now $P$
occurs with all its cyclic conjugates and by what has been said, their sum is
$(1 + \omega^i + \cdots + \omega^{i(n-1)})P$ which is 0 except when $i = 0$ or $n$, so the sum reduces
to $u^n + v^n$, which is $\alpha + \beta$, as claimed.
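
The identity $(u+v)^n = \alpha + \beta$ can be checked numerically in a matrix model of the defining relations. The sketch below is only an illustration under stated assumptions: it works over the complex numbers (where the symbol splits), takes $u$ to be a scaled cyclic shift and $v$ a scaled diagonal matrix of powers of $\omega$, and uses floating-point arithmetic, so equalities hold only up to rounding. The function names are hypothetical.

```python
import cmath


def mat_mul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(x, e):
    """e-th power of a square matrix by repeated multiplication."""
    n = len(x)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(e):
        result = mat_mul(result, x)
    return result


def symbol_generators(n, alpha, beta):
    """Assumed model: u = a*(cyclic shift), v = b*diag(1, w, ..., w^(n-1)),
    where a^n = alpha, b^n = beta and w is a primitive n-th root of 1."""
    omega = cmath.exp(2j * cmath.pi / n)
    a = complex(alpha) ** (1.0 / n)
    b = complex(beta) ** (1.0 / n)
    u = [[a if i == (j + 1) % n else 0 for j in range(n)] for i in range(n)]
    v = [[b * omega ** i if i == j else 0 for j in range(n)] for i in range(n)]
    return omega, u, v


def max_deviation(x, y):
    """Largest absolute entrywise difference between two matrices."""
    n = len(x)
    return max(abs(x[i][j] - y[i][j]) for i in range(n) for j in range(n))


def check_symbol(n, alpha, beta):
    """Deviations from the three relations and from (u + v)^n = alpha + beta."""
    omega, u, v = symbol_generators(n, alpha, beta)
    scalar = lambda c: [[c if i == j else 0 for j in range(n)] for i in range(n)]
    total = [[u[i][j] + v[i][j] for j in range(n)] for i in range(n)]
    omega_uv = [[omega * entry for entry in row] for row in mat_mul(u, v)]
    return {
        "u^n = alpha": max_deviation(mat_pow(u, n), scalar(alpha)),
        "v^n = beta": max_deviation(mat_pow(v, n), scalar(beta)),
        "vu = omega*uv": max_deviation(mat_mul(v, u), omega_uv),
        "(u+v)^n = alpha+beta": max_deviation(mat_pow(total, n), scalar(alpha + beta)),
    }
```

Calling `check_symbol(3, 2, 5)` returns four deviations, all of which should be of the order of rounding error; since any matrix model of the relations is a homomorphic image of the symbol, this is a sanity check rather than a proof.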

By applying Proposition 5.7.1 to a symbol, we obtain 


Corollary 5.7.2. Let $k$ be a field containing a primitive $n$-th root of 1. Then a symbol
$(\alpha, \beta; k)_n$ splits if and only if $\alpha$ is a norm from $k(\beta^{1/n})$ to $k$. ∎


There remains the case when char $k$ divides the degree of the algebra. We shall
only consider the case of a cyclic algebra $A$ of prime degree $p$ over a field $k$ of
characteristic $p$. A Galois splitting field $F$ of degree $p$ over $k$ contains an element $v$
such that $v^p - v = \beta \in k$, and $A$ contains an element $u$ such that $u^{-1}vu = v + 1$. Hence $u^p = \alpha \in k$ and
so $A$ has the basis $u^i v^j$ ($i, j = 0, 1, \dots, p-1$) with the defining relations

$$u^p = \alpha, \quad v^p - v = \beta, \quad vu = u(v + 1). \tag{5.7.4}$$

This algebra will be denoted by $(\alpha, \beta; k]_p$.




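We obtain a more symmetric form by putting $t = u^{-1}v$. Then $tu - ut =
u^{-1}vu - v = v + 1 - v = 1$. To find $t^p$ we note that in $k[x]$,

$$\prod_{i=0}^{p-1} (x - i) = x^p - x;$$

further, $vu^{-1} = u^{-1}(v - 1)$, therefore $t^p = u^{-1}v \cdots u^{-1}v = u^{-p}(v - (p-1)) \cdots
(v - 1)v = u^{-p}(v^p - v) = \alpha^{-1}\beta$. Thus $A$ may be defined by $t$, $u$ with the defining
relations

$$t^p = \gamma, \quad u^p = \alpha, \quad tu - ut = 1,$$

where we have put $\gamma = \beta/\alpha$.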
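
The polynomial identity used in this computation, that the product of the factors $x - i$ for $i = 0, \dots, p-1$ equals $x^p - x$ in characteristic $p$, is easy to confirm for small primes. The following Python sketch does so with integer coefficients reduced mod $p$; the helper names are hypothetical and polynomials are stored as coefficient lists, lowest degree first.

```python
def poly_mul_mod(f, g, p):
    """Product of two polynomials (coefficient lists, lowest degree first) mod p."""
    h = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] = (h[i + j] + a * b) % p
    return h


def product_of_linear_factors(p):
    """Coefficients of (x - 0)(x - 1)...(x - (p - 1)) reduced mod p."""
    f = [1]
    for i in range(p):
        f = poly_mul_mod(f, [(-i) % p, 1], p)
    return f


def x_to_p_minus_x(p):
    """Coefficients of x^p - x reduced mod p."""
    coeffs = [0] * (p + 1)
    coeffs[1] = (-1) % p
    coeffs[p] = 1
    return coeffs


def identity_holds(p):
    """True when the two polynomials agree coefficientwise mod p."""
    return product_of_linear_factors(p) == x_to_p_minus_x(p)
```

For a prime such as 5 or 7 the function `identity_holds` returns `True`, while for the composite modulus 4 it returns `False`, which shows that the primality of $p$ is really used.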

In Section 5.2 we saw that every central simple algebra is similar to a crossed 
product. It can actually be shown that when k has a primitive m-th root of 1, 
then every central division k-algebra of exponent m is similar to a tensor product 
of cyclic algebras of degree m. This is the content of the Merkuryev-Suslin [1986]
theorem, whose proof is beyond the level of this book (see Rowen (1988) for an 
illuminating discussion and a proof of various special cases). 


Exercises 


1. Show that $(\alpha, \beta; k)_n$ splits if it contains an element $x$ such that $1, x, \dots, x^{n-1}$ are
linearly independent and $x^n$ is an $n$-th power in $k$.
2. Show that if $D = (F/k, \sigma, \alpha)$ is cyclic of degree $n$, then $x^n - \alpha$ is irreducible over $k$ and
$D$ contains a maximal subfield generated by a root of $x^n = \alpha$.
3. Prove that $(F/k, \sigma, \alpha) \otimes (F/k, \sigma, \beta) \sim (F/k, \sigma, \alpha\beta)$.
4. Show that if $(F/k, \sigma, \alpha)$ has degree $n$, then for any $r$ prime to $n$, $(F/k, \sigma^r, \alpha^r) \cong
(F/k, \sigma, \alpha)$.
5. Show that $(F/k, \sigma, \alpha)$ has exponent $e$, where $e$ is the least number for which $\alpha^e$ is a
norm from $F$.
6. (Wedderburn) Show that a cyclic algebra $(F/k, \sigma, \alpha)$ of degree $n$ is a division
algebra if $\alpha^n$ is the least power of $\alpha$ which is a norm.
7. Let $k \subseteq F \subseteq E$, where $E/k$ is cyclic of degree $n$ and $[E : F] = d$. Show that if
$\mathrm{Gal}(E/k)$ is generated by $\sigma$ and $\sigma|F = \bar\sigma$, then $(F/k, \bar\sigma, \alpha) \sim (E/k, \sigma, \alpha^d)$.
8. With the notation of Theorem 5.6.2, show that if $E/k$ is cyclic, then
$(E/k, \sigma, \alpha)_F \sim (EF/F, \sigma^r, \alpha)$, where $r = [E \cap F : k]$.
9. Let $k$ be a field with a primitive $n$-th root of 1. Show that if $A = (F/k, \sigma, \alpha)$ is a
cyclic division algebra of degree $n^2$, then $A$ can also be represented as a crossed
product with group $C_n \times C_n$.




Further exercises on Chapter 5 


1. Let $D$ be a skew field, $\alpha$ an endomorphism and $\delta$ a $(1, \alpha)$-derivation of $D$. Show
that if $D$ is finite-dimensional over its centre, then either $\alpha$ or $\delta$ must be inner.
2. Let $D$ be a skew field with centre $k$ but not algebraic over $k$. Show that $D \otimes k(t)$
is a simple Noetherian ring, not a skew field (and not a full matrix ring over a
skew field).




3. (A. A. Albert) Let $D$ be a skew field which is totally ordered (BA, Section 8.8),
and suppose that $D$ is algebraic over its centre $k$. Show that the conjugates of
any positive element are again positive. Deduce that the sum of the conjugates
of any non-zero element cannot be zero, and hence prove that $D$ must be commutative.
(Hint. Use Exercise 9 of Section 5.1.)

4. (Kharchenko) Let $A$ be a central simple $k$-algebra of finite degree. By regarding
$A$ as a right $A^e$-module show that every $k$-linear mapping of $A$ into itself has the
form $f : x \mapsto \sum a_i x b_i$ ($a_i, b_i \in A$). Deduce the existence of a non-constant
central polynomial, i.e. a polynomial with values in $k$ (see Section 7.7 below).

5. Let $F/k$ be a field extension. Show that there is an exact sequence

$$0 \to B(F/k) \to B_k \xrightarrow{\ f\ } B_F.$$

Identify coker $f$ in case $F/k$ is a Galois extension.

6. (J.-P. Serre) Suppose that in a central division $k$-algebra $D$ of degree $n$ every
extension is $p$-radical. Show that $x^n \in k$ for all $x \in D$. By extension to a splitting
field obtain a contradiction, and hence give another proof of Theorem 5.2.8.


7. Let $A$, $B$ be crossed products with factor sets $(a)$, $(b)$ respectively. Given a Galois
splitting field $F$ for both $A$ and $B$, write $F = k(\theta)$, put $P = A \otimes B$ and

$$e = \prod_{\sigma \ne 1} (\theta \otimes 1 - 1 \otimes \theta^\sigma)\bigl((\theta - \theta^\sigma) \otimes 1\bigr)^{-1},$$

where the product is taken over all $\sigma \ne 1$ in $\mathrm{Gal}(F/k)$. Verify that for the minimum
polynomial $f$ of $\theta$ over $k$, $f(\theta \otimes 1) = f(\theta) \otimes 1 = 0$ and $(c \otimes 1)e = (1 \otimes c)e$
for all $c \in F$. Hence show that $e$ is idempotent and that $ePe$ is a crossed product
with factor set $(a)(b)$.


8. Let $F = k(\sqrt{c})$ (char $k \ne 2$). Show that there is a central division $k$-algebra with
$F$ as maximal subfield iff the norm form $x^2 - cy^2$ is not universal over $k$ (i.e. it does not
represent every $a \in k^\times$). Similarly in characteristic 2, if $F$ is generated over $k$ by a
root of $x^2 + x + c = 0$, the same holds iff $x^2 + xy + cy^2$ is not universal.


9. Show that $SL_2(\mathbf{F}_3)$ has order 24 but is not isomorphic to $\mathrm{Sym}_4$. (Hint. Show that
its derived group is the quaternion group. It is known as the binary tetrahedral
group.)

10. The Hilbert norm residue symbol $(a, b)_p$ is defined to be 1 or $-1$ according as
$(a, b; \mathbf{Q}_p)$ does or does not split, where $\mathbf{Q}_p$ is the $p$-adic field when $p$ is prime and
$\mathbf{R}$ when $p = \infty$. Using Proposition 5.4.3(e), show that for fixed $b$ and $p$ the $a$
with $(a, b)_p = 1$ form a group under multiplication. (Note that the law of quadratic
reciprocity, BA, Chapter 7, Further Exercise 23, may be expressed as
$\prod_p (a, b)_p = 1$, where the product is taken over all primes and over $p = \infty$.)


11. Let $k$ be a field with a primitive $n$-th root of 1 and $F \supseteq k$. Show that for any
$\alpha \in k$, $\beta \in F$, $\mathrm{cor}_{F/k}(\alpha, \beta; F)_n \sim (\alpha, N_{F/k}(\beta); k)_n$.

12. Let $A$ be a central simple $k$-algebra of index $p^a m$, where $p$ is a prime not dividing
$m$ and $a \ge 1$. Show that there is a separable extension $F$ of degree prime to $p$
over $k$ such that $A_F$ has index $p^a$.

13. Show that the Brauer classes split by a cyclic extension $F/k$ of degree $n$ form a
group $H^2(C_n, F^\times) \cong k^\times / N_{F/k}(F^\times)$.




14. Show that if $x$, $y$ are regular elements over a field of prime characteristic $p$,
satisfying $xy - yx = 1$, then $(xy)^p = x^p y^p + xy$; deduce further that $(xy)^{p-1} =
y^{p-1}x^{p-1} + 1$.

15. Let $D$ be a central division $k$-algebra of prime degree $p$. If $D$ has a maximal
subfield $E$ not its own normalizer in $D^\times$, show that $E/k$ is Galois and deduce
that $D$ is a crossed product.

16. (L. E. Dickson) Let $D$ be a central division $k$-algebra of degree 3 and suppose
$u_1 \in D \setminus k$ has the minimal polynomial $(x - u_1)(x - u_2)(x - u_3)$. Find $v \in D$
such that $u_i v = v u_{i+1}$ ($i$ mod 3) and show that either $k(v)$ or $k(u_1)$ is not its
own normalizer. Deduce that $D$ is a crossed product. (Hint. Try a quadratic
polynomial in the $u$'s for $v$.)

17. Show that every central division algebra of degree 6 is cyclic.


Representation theory of 
finite groups 


Although much of the theory of finite-dimensional algebras had its origins in the 
theory of group representations, it seems simpler nowadays to develop the theory 
of algebras first and then use it to give an account of group representations. This 
theory has been a powerful tool in the study of groups, especially the modular 
theory (representations over a field of finite characteristic), which has played a key 
role in the classification of finite simple groups. The theory also has important appli- 
cations to physics: quantum mechanics describes physical systems by means of states 
which are represented by vectors in Hilbert space (infinite-dimensional complete 
unitary space). Any group which may act on the system, such as the rotation 
group or a permutation group of the constituent particles, acts by unitary trans- 
formations on this Hilbert space and any finite-dimensional subspace admitting 
the group leads to a representation of the group. If we know the irreducible representations
of our group, this will often allow us to classify these spaces.

Of course an introductory chapter like the present one is not the place to develop 
modular representations, nor the applications to physics. The plan of the chapter is 
as follows. The first four sections give a concise account of the theory based on the 
Wedderburn theorems (Chapter 5 of BA), including the basic results on ortho- 
gonality and completeness (Section 6.3) and in Section 6.4 we explain the role of 
characters. Some simplifications can be made over the complex numbers and they 
are described in Section 6.5. The rest of the chapter deals with representations and 
characters of the symmetric group in Section 6.6, and in Section 6.7 describes 
induced representations, an important technique which is illustrated in Section 6.8 
by the theorems of Burnside and Frobenius. 


6.1 Basic definitions 


Let $G$ be any group (not necessarily finite). By a representation of $G$ over a field $k$ one
understands a homomorphism

$$\rho : G \to GL_d(k), \tag{6.1.1}$$




where $GL_d(k)$ is the general linear group of degree $d$ over $k$, i.e. the group of all invertible
$d \times d$ matrices over $k$. Thus we have a mapping $x \mapsto \rho(x)$ such that

$$\rho(xy) = \rho(x)\rho(y) \quad \text{for all } x, y \in G. \tag{6.1.2}$$

Since each matrix $\rho(x)$ is invertible, we have $\rho(1) = I$, where 1 is the neutral element
of $G$, and $\rho(x^{-1}) = \rho(x)^{-1}$. The integer $d$ is called the degree of the representation.
For example, to find a representation of the cyclic group $C_3 = \{1, t, t^2\}$ over $\mathbf{R}$ of
degree 2, we need to find $A \in GL_2(\mathbf{R})$ such that $A^3 = I$. We may take
$A = \begin{pmatrix} 0 & 1 \\ -1 & -1 \end{pmatrix}$ and then $\rho$ is defined by

$$\rho(1) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad \rho(t) = \begin{pmatrix} 0 & 1 \\ -1 & -1 \end{pmatrix}, \quad \rho(t^2) = \begin{pmatrix} -1 & -1 \\ 1 & 0 \end{pmatrix}. \tag{6.1.3}$$


Every group has the trivial representation, obtained by mapping each element of G 
to I. 

At the other extreme we have the faithful representations, defined as homo- 
morphisms with trivial kernel; e.g. (6.1.3) is a faithful representation of C3. 

Two representations $\rho$, $\sigma$ of a group $G$ are said to be equivalent, if they have the
same degree, $d$ say, and there exists $P \in GL_d(k)$ such that

$$\sigma(x) = P^{-1}\rho(x)P \quad \text{for all } x \in G. \tag{6.1.4}$$


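It is clear that this is indeed an equivalence relation on the set of all representations
of $G$. For example, if $\omega$ is a primitive cube root of 1, then

$$\begin{pmatrix} 0 & 1 \\ -1 & -1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ \omega & \omega^2 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ \omega & \omega^2 \end{pmatrix} \begin{pmatrix} \omega & 0 \\ 0 & \omega^2 \end{pmatrix};$$

therefore the representation $\rho$ of $C_3$ given by (6.1.3) is equivalent to $\sigma$, where $\sigma$ is
given by

$$\sigma(t) = \begin{pmatrix} \omega & 0 \\ 0 & \omega^2 \end{pmatrix}, \quad \sigma(t^2) = \begin{pmatrix} \omega^2 & 0 \\ 0 & \omega \end{pmatrix}, \quad \sigma(1) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$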
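
The two facts used in this example can be checked by direct computation. The sketch below assumes the real matrix $A$ with rows $(0, 1)$ and $(-1, -1)$ for $\rho(t)$, and, as a choice made here for illustration only, takes for $P$ the matrix whose columns are the eigenvectors $(1, \omega)^T$ and $(1, \omega^2)^T$ of $A$. It verifies $A^3 = I$ exactly and $AP = PD$ with $D = \mathrm{diag}(\omega, \omega^2)$ up to rounding.

```python
import cmath


def mat_mul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Assumed matrix for rho(t): integer entries, so powers are computed exactly.
A = [[0, 1], [-1, -1]]
A2 = mat_mul(A, A)
A3 = mat_mul(A2, A)

# A primitive cube root of 1 and an illustrative choice of P (eigenvectors of A).
omega = cmath.exp(2j * cmath.pi / 3)
P = [[1, 1], [omega, omega ** 2]]
D = [[omega, 0], [0, omega ** 2]]


def intertwining_error():
    """Largest entry of A*P - P*D in absolute value; A*P = P*D means D = P^-1 A P."""
    left, right = mat_mul(A, P), mat_mul(P, D)
    return max(abs(left[i][j] - right[i][j]) for i in range(2) for j in range(2))
```

Here `A3` is the identity matrix, so $t \mapsto A$ does define a representation of $C_3$, and a negligible value of `intertwining_error()` confirms that it is equivalent over $\mathbf{C}$ to the diagonal representation.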


If we interpret the matrices of a representation of $G$ as linear transformations of a
vector space, we reach the notion of a $G$-module. A $G$-module is a vector space $V$
over $k$ such that each $x \in G$ defines a linear mapping $v \mapsto vx$ on $V$ satisfying

$$v(xy) = (vx)y, \quad v1 = v \quad \text{for all } v \in V,\ x, y \in G. \tag{6.1.5}$$

Given two $G$-modules $U$ and $V$, a homomorphism or $G$-homomorphism from $U$
to $V$ is a $k$-linear mapping $f : U \to V$ such that

$$f(ux) = (fu)x \quad \text{for all } u \in U,\ x \in G. \tag{6.1.6}$$

If a homomorphism from $U$ to $V$ is bijective, its inverse is easily seen to be a homomorphism
from $V$ to $U$; this is called an isomorphism or a $G$-isomorphism. We then
say that $U$ and $V$ are isomorphic and write $U \cong V$.




To establish the link between representations of $G$ and $G$-modules, let us take a
finite-dimensional $G$-module $V$, with basis $v_1, \dots, v_d$ over $k$. The action of $G$ on
$V$ is completely described by the equations

$$v_i x = \sum_j \rho_{ij}(x) v_j \quad (x \in G), \tag{6.1.7}$$

where $\rho_{ij}(x) \in k$, and it is easily checked that the matrices $\rho(x) = (\rho_{ij}(x))$ form a
representation of $G$; we shall say that the $G$-module $V$ with the basis $v_1, \dots, v_d$
affords the representation $\rho$. Conversely, given a representation $\rho = (\rho_{ij})$ of $G$ of
degree $d$ and a $d$-dimensional vector space $V$ over $k$, we can turn $V$ into a $G$-module
by defining the action of $x \in G$ on a basis $v_1, \dots, v_d$ by (6.1.7) and generally
putting

$$\Bigl(\sum_i \alpha_i v_i\Bigr)x = \sum_{i,j} \alpha_i \rho_{ij}(x) v_j.$$

The verification that this provides a $G$-module is straightforward and may be left to
the reader.

To see that the operations of passing between representations and modules are
mutually inverse we need to examine the effect of a change of basis on the representation.
Let $V$ be a $G$-module affording the representation $\rho$ relative to a basis
$v_1, \dots, v_d$. Thus the equations (6.1.7) hold, which may be written concisely as

$$vx = \rho(x)v, \tag{6.1.8}$$

where $v = (v_1, \dots, v_d)^T$ stands for the column of basis vectors $v_1, \dots, v_d$. Suppose
that $u = (u_1, \dots, u_d)^T$ is a second basis of $V$, affording the representation $\sigma$, so that

$$ux = \sigma(x)u. \tag{6.1.9}$$

If the matrix of transformation from $u$ to $v$ is denoted by $P$, we have

$$v = Pu, \quad u = P^{-1}v. \tag{6.1.10}$$

Hence $\sigma(x)u = ux = (P^{-1}v)x = P^{-1}(vx) = P^{-1}\rho(x)v = P^{-1}\rho(x)Pu$. It follows that

$$\sigma(x) = P^{-1}\rho(x)P; \tag{6.1.11}$$

thus $\rho$ and $\sigma$ are equivalent, and what we have shown is that different bases of a
$G$-module afford equivalent representations. Moreover, since $P$ may be any invertible
matrix, we see that representations of $G$ that are equivalent are afforded by the same
$G$-module, for suitable bases. Further, if two $G$-modules afford the same representation,
they must be isomorphic. For take the modules to be $V$, $W$ with bases
$v = (v_1, \dots, v_d)^T$, $w = (w_1, \dots, w_d)^T$ and let

$$vx = \rho(x)v, \quad wx = \rho(x)w.$$

Then the mapping $\sum \alpha_i v_i \mapsto \sum \alpha_i w_i$ is easily verified to be a $G$-isomorphism
between $V$ and $W$. It follows that isomorphic modules afford equivalent representations,
for by changing the basis in one of the modules we can make the representations
equal. Thus we have proved




Proposition 6.1.1. For any group $G$ there is a natural bijection between the sets of
equivalence classes of representations and isomorphism classes of $G$-modules. ∎


By this result we can use G-modules and representations of G interchangeably; we 
shall examine various concepts from both points of view, but first we must clarify the 
connexion with modules over a ring, discussed in Chapter 4 of BA. In order to do so 
we shall recall the notion of a group algebra. For any group G and any field k we can 
form the vector space over $k$ on $G$ as basis, denoted by $kG$. Its elements have the form
$\sum a_x x$, where the summation is over all $x \in G$, and if $G$ is infinite, almost all the $a_x$
are zero. Using the multiplication in $G$ and distributivity, we thus obtain an algebra
$kG$ which is known as the group algebra of $G$.

We can form modules over $kG$, as for any ring, and it is clear that a $kG$-module is
also a $G$-module. Conversely, a $G$-module $V$ becomes a $kG$-module by the rule

$$v\Bigl(\sum a_x x\Bigr) = \sum a_x (vx) \quad (v \in V),$$

where the summation is over all $x \in G$. In particular, $kG$ itself may be regarded as a
right $kG$-module; this is the regular representation, which provides a faithful representation
of $G$, since $xg = x$ for all $x \in kG$ implies that $1 = 1g = g$.
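
To make the regular representation concrete, the following Python sketch builds it for the symmetric group of degree 3, following the convention $v_i x = \sum_j \rho_{ij}(x)v_j$ with the group elements themselves as basis. The encoding of group elements as tuples and the function names are choices made here for illustration, not part of the text.

```python
from itertools import permutations


def compose(x, y):
    """Product xy of two permutations given as tuples: apply x first, then y."""
    return tuple(y[x[i]] for i in range(len(x)))


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# The six elements of Sym_3, used as a k-basis of the group algebra.
G = list(permutations(range(3)))
index = {g: i for i, g in enumerate(G)}


def regular_matrix(x):
    """Matrix rho(x) of right multiplication by x: rho_ij(x) = 1 iff g_i x = g_j."""
    n = len(G)
    m = [[0] * n for _ in range(n)]
    for i, g in enumerate(G):
        m[i][index[compose(g, x)]] = 1
    return m


def is_homomorphism():
    """Check rho(xy) = rho(x) rho(y) for all pairs of group elements."""
    return all(regular_matrix(compose(x, y)) == mat_mul(regular_matrix(x), regular_matrix(y))
               for x in G for y in G)


def is_faithful():
    """Check that distinct group elements give distinct matrices."""
    images = {tuple(map(tuple, regular_matrix(x))) for x in G}
    return len(images) == len(G)
```

Both `is_homomorphism()` and `is_faithful()` return `True` for this group; each $\rho(x)$ is a permutation matrix of degree 6, the order of the group.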

Given a G-module V, we can define a submodule as for modules over a ring, as 
a subspace V’ of V admitting the G-action, i.e. such that vx e V’ for ve V’, 
x € G. Alternatively we may regard V as kG-module and look for its kKG-submodules; 
clearly they are just the G-submodules of V. Likewise the homomorphisms between 
G-modules are nothing other than the module homomorphisms between kG- 
modules. 

Let us now examine the form taken by the representations corresponding to submodules.
We consider a $G$-module $V$ with a submodule $V'$. If $v_1, \dots, v_d$ is a basis of
$V$, adapted to $V'$, say $v_1, \dots, v_t$ ($t < d$) is a basis of $V'$ and the action is given by
(6.1.8), then for $i \le t$ we have $v_i x \in V'$ and so $\rho_{ij}(x) = 0$ for $i \le t < j$. Thus $\rho$ has
the form

$$\rho(x) = \begin{pmatrix} \rho'(x) & 0 \\ \theta(x) & \rho''(x) \end{pmatrix}. \tag{6.1.12}$$

We note that $\rho'(x)$ is a representation afforded by $V'$, while $\rho''$ is afforded by the
quotient module $V/V'$ relative to the basis $\bar v_{t+1}, \dots, \bar v_d$, where $\bar v$ denotes the residue
class of $v$. Both $\rho'$ and $\rho''$ are sometimes called subrepresentations of $\rho$.

We note that if the basis of $V$ is chosen so that the last $t$ members form a basis of
$V'$ (instead of the first $t$), then $\rho$ takes the form

$$\rho(x) = \begin{pmatrix} \rho''(x) & \theta(x) \\ 0 & \rho'(x) \end{pmatrix}.$$

A representation is said to be reducible if it is equivalent to a representation of
the form (6.1.12), where $0 < t < d$. Thus $\rho$ is reducible iff the corresponding $G$-module
$V$ has a non-zero proper submodule, i.e. it is not simple. In the contrary
case, when $V$ is simple, the representation is called irreducible.




If $\rho$ can be written in the form of a diagonal sum:

$$\rho(x) = \begin{pmatrix} \rho'(x) & 0 \\ 0 & \rho''(x) \end{pmatrix},$$

it is said to be completely reduced. Clearly this corresponds to $V$ being directly
decomposable. We observe that any finite-dimensional $G$-module $V$ has a composition
series:

$$V = V_0 \supset V_1 \supset V_2 \supset \cdots \supset V_r = 0,$$

such that $V_{i-1}/V_i$ is simple. The corresponding representation can then be taken in
the form

$$\rho(x) = \begin{pmatrix} \rho_1(x) & & & 0 \\ & \rho_2(x) & & \\ & & \ddots & \\ * & & & \rho_r(x) \end{pmatrix}. \tag{6.1.13}$$

If $\rho$ is completely reducible, i.e. we can find an equivalent representation of the form
(6.1.13) in which $* = 0$, this means that the corresponding $G$-module is a direct sum
of simple modules, i.e. semisimple.

Just as for modules over a ring we can define left G-modules; they are vector spaces
V with a G-action v ↦ xv such that

\[ x(yv) = (xy)v, \qquad 1v = v. \]

However, any such left G-module may be regarded as a right G-module by defining
v.x = x^{-1}v (x ∈ G). For we have v.(xy) = (xy)^{-1}v = (y^{-1}x^{-1})v = y^{-1}(x^{-1}v) =
y^{-1}(v.x) = (v.x).y.

In terms of the group algebra this may be expressed as follows: the group algebra
kG has an anti-automorphism *, i.e. a linear mapping satisfying (ab)* = b*a*,
given by

\[ \Bigl(\sum a_x x\Bigr)^{*} = \sum a_x x^{-1}. \tag{6.1.14} \]

Now any left kG-module becomes a right kG-module on putting v.a = a*v.

We remark that the mapping defined by (6.1.14) has the property a** = a.
An anti-automorphism of order two is called an involution; thus kG is an algebra
with an involution.


Exercises 


1. Let G be a group and consider the regular representation of kG (defined by right
multiplication). Show that this representation always has the trivial representation
x ↦ 1 as a subrepresentation.

2. Let F be a field containing a primitive n-th root of 1, say ω (hence of characteristic
0 or prime to n) and let C_n be the cyclic group of order n, with generator t. Show
that ρ_k : t^i ↦ ω^{ik} is a representation of degree 1 of C_n for k = 0, 1, ..., n − 1.
Show that if ρ is any representation of C_n over F, then ρ(t) can be transformed to
diagonal form. Deduce that ρ can be written as a diagonal sum of the ρ_k.

3. Show that if ρ is any representation of a group G, then ρ* defined as
ρ*(x) = ρ(x^{-1})^T is again a representation of G (it is called the contragredient
of ρ). What are the conditions for ρ* to coincide with ρ?

4. Let G be a group and ρ, σ representations of G, where σ is of degree 1. Show that
σρ is again a representation of G, and this is irreducible iff ρ is.

5. Let ρ be an irreducible representation of a (finite) p-group G, acting on a vector
space V over a field k of characteristic p. Show that V contains a vector v ≠ 0
which is fixed under the action of G. Show that the matrices ρ(x) (x ∈ G) have
a common eigenvector and deduce that ρ must be the trivial representation.
(Hint. Use the fact that ρ(x) − 1 is nilpotent; a matrix ρ(x) with this property
is called unipotent.)

6. Let G be a p-group and k a field of characteristic p. Suppose that G acts transitively
on a finite set S and let V be a k-space on S as basis, with the G-action defined by
the permutations of S. Show that V has a unique maximal submodule; deduce
that V is indecomposable, but not simple, unless it is one-dimensional.

7. Let p be an irreducible representation of degree d of a finite group G over a field 
of characteristic p. Show that any normal subgroup of index p“ lies in the kernel 
of p. Deduce that if p is faithful, then G has no non-trivial normal p-subgroup. 


6.2 The averaging lemma and Maschke’s theorem 


For a closer study of representations we need to assume that our group is finite and 
we shall make this assumption from now on. The first important fact to note is that 
for a finite group every representation over a field of characteristic 0 is completely 
reducible. 


Theorem 6.2.1 (Maschke’s theorem, 1899). Let G be a finite group and k a field of
characteristic 0 or prime to the order of G. Then every representation of G over k is
completely reducible.


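Proof. Let ρ be a representation of G and suppose that ρ is reduced:

\[ \rho(x) = \begin{pmatrix} \rho'(x) & 0 \\ \theta(x) & \rho''(x) \end{pmatrix}, \tag{6.2.1} \]

where ρ′, ρ″ are subrepresentations of degrees d′, d″ respectively. To establish
complete reducibility it will be enough to find a d″ × d′ matrix μ such that

\[ \begin{pmatrix} \rho'(x) & 0 \\ \theta(x) & \rho''(x) \end{pmatrix} \begin{pmatrix} I & 0 \\ \mu & I \end{pmatrix} = \begin{pmatrix} I & 0 \\ \mu & I \end{pmatrix} \begin{pmatrix} \rho'(x) & 0 \\ 0 & \rho''(x) \end{pmatrix}. \]

When we multiply out, only the (2, 1)-block gives anything new:

\[ \theta(x) = \mu\rho'(x) - \rho''(x)\mu, \tag{6.2.2} \]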




and we shall complete the proof by finding a matrix μ to satisfy this equation. By
substituting from (6.2.1) in the relation ρ(xy) = ρ(x)ρ(y) we obtain the following
equation for θ(x):

\[ \theta(xy) = \theta(x)\rho'(y) + \rho''(x)\theta(y). \tag{6.2.3} \]

Writing |G| = m and noting that m ≠ 0 in k, by hypothesis, we have

\[ m\theta(x) = \sum_y \theta(x)\rho'(y)\rho'(y^{-1}) = \sum_y \bigl[\theta(xy) - \rho''(x)\theta(y)\bigr]\rho'(y^{-1}). \]

Put z = xy; then y^{-1} = z^{-1}x, and as y runs over G, so does z, for fixed x. Hence we
can rewrite this last sum as

\[ m\theta(x) = \sum_z \theta(z)\rho'(z^{-1})\rho'(x) - \rho''(x)\sum_y \theta(y)\rho'(y^{-1}), \]

and this has the form (6.2.2), if we abbreviate m^{-1} Σ_y θ(y)ρ′(y^{-1}) as μ. ∎


In view of its importance we shall give a second proof of this result, or rather,
restate the same proof in module terms. The essential step is a lemma which is
also used elsewhere, but first we shall need to introduce some notation. If U, V
are G-modules over k and α is a mapping from U to V, we shall write
α : U →_k V, α : U →_G V to indicate that α is k-linear or a G-homomorphism
respectively. The space of all k-linear mappings from U to V is denoted by
Hom_k(U, V) and the subspace of G-homomorphisms by Hom_G(U, V).

In the next lemma we shall (exceptionally) write mappings between right
G-modules on the right, so that for α : U →_k V the condition for a G-homomorphism
is that

\[ (ux)\alpha = (u\alpha)x \quad \text{for all } u \in U,\ x \in G. \]


Lemma 6.2.2 (Averaging lemma). Let G be a finite group and k a field of characteristic
0 or prime to |G|. Given any two G-modules U, V and α : U →_k V, the mapping

\[ \alpha^{*} : u \mapsto |G|^{-1} \sum_x \bigl((ux^{-1})\alpha\bigr)x \tag{6.2.4} \]

is a G-homomorphism from U to V. Moreover,

(i) if α is a G-homomorphism, then α* = α,
(ii) if α : U →_k V, β : V →_G W, then (αβ)* = α*β,
(iii) if α : U →_G V, β : V →_k W, then (αβ)* = αβ*.




Proof. Let us fix a ∈ G and write y = xa, x = ya^{-1}. Then as one of x, y runs over G,
so does the other. Now for α : U →_k V we have

\[ |G|.u\alpha^{*}a = \sum_x ux^{-1}\alpha xa = \sum_y uay^{-1}\alpha y = |G|.ua\alpha^{*}. \tag{6.2.5} \]

This shows α* to be a G-homomorphism. If α is a G-homomorphism, each term in
the sum in (6.2.5) is uαa = uaα, so α* = α in this case and (i) follows. Now let
β : V →_G W; then

\[ |G|.u(\alpha\beta)^{*} = \sum_x ux^{-1}\alpha\beta x = \sum_x ux^{-1}\alpha x\beta = |G|.u\alpha^{*}\beta. \]

Hence (ii) follows; (iii) is proved similarly. ∎


We note that if neither α nor β is a G-homomorphism, there is nothing we can
say. We can now prove the module form of Maschke’s theorem, which states that
every module extension splits, or equivalently, that the group algebra kG is
semisimple.


Theorem 6.2.3 (Maschke’s theorem, form 2). Let G be a finite group and k a field of 
characteristic 0 or prime to |G|. Then kG is semisimple. 


Proof. We shall show that every (finite-dimensional) G-module is semisimple, or
equivalently, that every short exact sequence of G-modules

\[ 0 \to V' \xrightarrow{\ \alpha\ } V \to V'' \to 0 \tag{6.2.6} \]

splits. Such a sequence certainly splits as a sequence of k-spaces, for this just means
that V′ as k-subspace of V has a vector space complement. Thus we have a k-linear
splitting map γ : V → V′. We have αγ = 1_{V′}; therefore 1 = 1* = (αγ)* = αγ*, and
so γ* is the desired G-homomorphism splitting the sequence (6.2.6). ∎
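
The averaging step can be tried out on a small case. The sketch below is an illustration under assumptions of our own, not part of the text: it takes G = Sym_3 acting on V = Q^3 by permuting coordinates, the submodule V′ spanned by (1, 1, 1), and a projection γ of V onto V′ that is k-linear but not a G-homomorphism, and forms γ* as in (6.2.4). The names `perm_matrix`, `gamma` and `gamma_star` are hypothetical, and vectors are rows so that mappings act on the right.

```python
from fractions import Fraction
from itertools import permutations

n = 3
G = list(permutations(range(n)))          # Sym_3 acting on 3 points

def inverse(g):
    inv = [0] * n
    for i, gi in enumerate(g):
        inv[gi] = i
    return tuple(inv)

def perm_matrix(g):
    """Row convention: e_i g = e_{g(i)}, so row i has its 1 in column g[i]."""
    return [[Fraction(int(g[i] == j)) for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# A k-linear projection of V onto V' = k(1,1,1) that ignores the G-action:
# v -> (first coordinate of v) * (1,1,1).
gamma = [[Fraction(1)] * n] + [[Fraction(0)] * n for _ in range(n - 1)]

# Averaging as in (6.2.4): gamma* = |G|^{-1} sum_x rho(x^{-1}) gamma rho(x).
gamma_star = [[Fraction(0)] * n for _ in range(n)]
for x in G:
    term = matmul(matmul(perm_matrix(inverse(x)), gamma), perm_matrix(x))
    gamma_star = [[gamma_star[i][j] + term[i][j] / len(G) for j in range(n)]
                  for i in range(n)]

# gamma* commutes with the action and is still a projection onto V'.
commutes = all(
    matmul(gamma_star, perm_matrix(g)) == matmul(perm_matrix(g), gamma_star)
    for g in G
)
idempotent = matmul(gamma_star, gamma_star) == gamma_star
print(gamma_star, commutes, idempotent)
```

Here the average is the matrix with every entry 1/3, whose kernel (the vectors with coordinate sum zero) is a G-invariant complement of V′. The division by |G| = 6 is exactly where the hypothesis on the characteristic enters.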


Exercises 


1. Let G be a finite group and V a finite-dimensional G-module over a field of char- 
acteristic prime to |G|. Show that if G acts trivially on every simple composition 
factor of V then the G-action on V is trivial. 

2. Show that for any (finite-dimensional) left G-modules U, V, Hom_k(U, k) ⊗_k V ≅
Hom_k(U, V).

3. For G = C_2 = gp{t | t^2 = 1} and k = F_2 define a two-dimensional space V with
basis v_1, v_2 as G-module by v_1 t = v_1 + v_2, v_2 t = v_2. Verify that V is not
semisimple; calculate the corresponding representation.

4. Show that the infinite cyclic group has, over a field of characteristic 0, a faithful
two-dimensional representation which is not completely reducible.

5. Let G be a finite group and k a field of characteristic dividing |G|. Show that the
element z = Σ_x x is central and nilpotent in kG. Deduce that kG is not semisimple.

6. Let k be a field of characteristic p and G a finite group. Show that for any element
g of p-power order, g − 1 is nilpotent in kG. Deduce that for a finite p-group G
the radical of kG is the augmentation ideal. (Hint. Find a basis of nilpotent
elements for the radical and use Theorem 5.5.4.) Deduce further that kG is
completely primary.


6.3 Orthogonality and completeness 


The representation theory of finite groups was developed by Georg Frobenius in the 
1880s and 1890s, using the determinant of the matrix of a general group element in 
the regular representation. Issai Schur in his dissertation in 1901 greatly simplified
the theory by using his lemma, in the form given below, and the averaging 
lemma, Lemma 6.2.2. 


Lemma 6.3.1 (Schur’s lemma). Let R be any ring and U, V two simple R-modules.
Then

(i) Hom_R(U, V) = 0 unless U ≅ V,
(ii) End_R(U) is a skew field.


Proof. (i) If f : U → V is a non-zero homomorphism, then ker f is a proper
submodule of U, and hence is 0, while im f is a non-zero submodule of V and so is
equal to V. Thus f is an isomorphism, as claimed.

(ii) When V = U, this argument shows that every non-zero endomorphism of U
is an automorphism, and (ii) follows. ∎


When the ground field k is algebraically closed, every matrix over k has an
eigenvalue, so for each automorphism f of U there exists λ ∈ k such that f − λ.1 is
non-invertible, and hence zero, i.e. f = λ.1. This proves the following sharper form of
Lemma 6.3.1:


Lemma 6.3.2 (Schur’s lemma for algebraically closed fields). Let k be an
algebraically closed field, A a k-algebra and U, V any two simple A-modules,
finite-dimensional over k. Then

\[ \operatorname{Hom}_A(U, V) \cong \begin{cases} k & \text{if } U \cong V, \\ 0 & \text{otherwise.} \end{cases} \tag{6.3.1} \]


Let G be a finite group; we shall define an inner product on the group algebra kG by
the rule

\[ \Bigl(\sum a(x)x, \sum b(x)x\Bigr) = |G|^{-1} \sum_x a(x^{-1})\,b(x). \tag{6.3.2} \]


It is clear that this product is bilinear; it is also symmetric, as we see on replacing
x by x^{-1} in the sum, and it satisfies the equation

\[ (g, f) = (f^{*}, g^{*}), \tag{6.3.3} \]




where * is the involution defined by (6.1.14). The product is regular, i.e.
non-singular: if (f, x) = 0 for all x ∈ G, then f(x^{-1}) = 0 for all x ∈ G and so f = 0. Of
course from the point of view of the inner product (6.3.2) the multiplication on kG is
immaterial, and kG may be thought of as the space of all k-valued functions on G.

Our next aim is to show that the different representation coefficients, regarded as 
functions on G, are orthogonal. 


Theorem 6.3.3 (Orthogonality relations for representations). Let G be a finite
group and k an algebraically closed field of characteristic 0 or prime to |G|. If ρ, σ
are irreducible representations of degrees c, d respectively, then we have the relations

\[ \frac{1}{|G|} \sum_x \rho_{ij}(x^{-1})\,\sigma_{pq}(x) = \begin{cases} 0 & \text{if } \rho \text{ and } \sigma \text{ are inequivalent,} \\ \dfrac{1}{d}\,\delta_{jp}\delta_{iq} & \text{if } \rho = \sigma. \end{cases} \tag{6.3.4} \]

Thus different representation coefficients are orthogonal. We note that the
alternatives on the right of (6.3.4) are not exhaustive: the representations ρ, σ may be
equivalent but distinct. In that case (6.3.4) will not apply (but of course we can
use (6.3.4) even then, after transforming one of ρ, σ into the other).


Proof. Take spaces U, V affording ρ, σ with bases u_1, ..., u_c, v_1, ..., v_d and let
α_jp : U →_k V be the linear mapping defined by

\[ u_i \alpha_{jp} = \delta_{ij} v_p. \]

Explicitly the (i, q)-entry of the matrix for α_jp is

\[ (\alpha_{jp})_{iq} = \delta_{ij}\delta_{pq}. \tag{6.3.5} \]

By Lemma 6.2.2, α*_jp is a G-homomorphism from U to V, and its matrix is given by
applying * to (6.3.5):

\[ |G|(\alpha^{*}_{jp})_{iq} = \sum_{x,h,r} \rho_{ih}(x^{-1})\,\delta_{hj}\delta_{pr}\,\sigma_{rq}(x) = \sum_x \rho_{ij}(x^{-1})\,\sigma_{pq}(x). \]

If ρ, σ are inequivalent, then α*_jp = 0 by Lemma 6.3.2, and this proves the first line
of (6.3.4).

Next we take ρ = σ. By Lemma 6.3.2, α*_jp = λ_jp ∈ k, hence we have

\[ \sum_x \rho_{ij}(x^{-1})\,\rho_{pq}(x) = |G|.\lambda_{jp}\delta_{qi}. \tag{6.3.6} \]

To find λ_jp we put q = i and sum over i:

\[ |G|.d\lambda_{jp} = \sum_{i,x} \rho_{pi}(x)\,\rho_{ij}(x^{-1}) = \sum_x \rho_{pj}(1) = |G|.\delta_{jp}. \]

Inserting this value in (6.3.6) we obtain the second line of (6.3.4). ∎
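
The relations can be checked by direct summation on a small example. The following sketch is an illustration under our own assumptions, not taken from the text: it uses the two-dimensional representation of Sym_3 on the coordinate-sum-zero vectors of k^3 (written in an integer basis, so the matrices are not unitary) and verifies that Σ_x ρ_ij(x^{-1})ρ_pq(x) equals (|G|/d)δ_jp δ_iq, as (6.3.6) gives once λ_jp is known, and that the coefficients are orthogonal to the sign representation. The function names are hypothetical.

```python
from itertools import permutations

G = list(permutations(range(3)))

def inverse(g):
    inv = [0] * 3
    for i, gi in enumerate(g):
        inv[gi] = i
    return tuple(inv)

def sign(g):
    inversions = sum(1 for i in range(3) for j in range(i + 1, 3) if g[i] > g[j])
    return -1 if inversions % 2 else 1

def rho(g):
    """Representation on the sum-zero vectors of k^3 with basis
    u_1 = e_0 - e_2, u_2 = e_1 - e_2, where e_i g = e_{g(i)} (row convention)."""
    rows = []
    for i in (0, 1):
        w = [0, 0, 0]
        w[g[i]] += 1          # image of e_i
        w[g[2]] -= 1          # minus the image of e_2
        rows.append(w[:2])    # coordinates of w in the basis u_1, u_2
    return rows

d = 2
# Sums of rho_ij(x^{-1}) rho_pq(x) over the group, for all index choices.
same = {
    (i, j, p, q): sum(rho(inverse(x))[i][j] * rho(x)[p][q] for x in G)
    for i in range(d) for j in range(d) for p in range(d) for q in range(d)
}
# Sums of rho_ij(x^{-1}) sign(x): coefficients of inequivalent representations.
mixed = {
    (i, j): sum(rho(inverse(x))[i][j] * sign(x) for x in G)
    for i in range(d) for j in range(d)
}
print(same[(0, 1, 1, 0)], same[(0, 0, 1, 1)], mixed)
```

Because the formula involves ρ(x^{-1}) rather than a conjugate transpose, it holds in any basis, which is why integer matrices suffice here.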




To illustrate this result, let us take the trivial representation for σ. Then σ(x) = 1
for all x ∈ G and we find that every non-trivial irreducible representation ρ of G
satisfies

\[ \sum_x \rho_{ij}(x) = 0 \quad \text{for all } i, j. \tag{6.3.7} \]

More generally, if ρ is any representation, we can take it to be completely reduced,
and we then see that (6.3.7) holds precisely when ρ does not contain the trivial
representation.

In terms of the inner product (6.3.2) the relation (6.3.4) may be written

\[ (\rho_{ij}, \sigma_{pq}) = (1/d)\,\delta_{jp}\delta_{iq} \ \text{ or } \ 0; \]

this shows the ρ_ij to be a linearly independent set of functions; in particular their
number cannot exceed dim kG = |G|. Hence we have


Corollary 6.3.4. The coefficients of inequivalent irreducible representations of a finite
group G are linearly independent; hence the number of such representations is finite, and
if their degrees are d_1, ..., d_t then

\[ \sum_{i=1}^{t} d_i^2 \le |G|. \tag{6.3.8} \]


Our next task is to show that equality holds in (6.3.8) if we take enough
representations. This means that every k-valued function on G can be written as a linear
combination of irreducible representation coefficients; this is expressed by saying
that these coefficients form a complete system of functions on G.

To see this, let us go back to the group algebra kG. We have seen that this is
semisimple, hence a direct product of full matrix rings over skew fields. But as we saw, the
only skew field finite-dimensional over k is k itself, because k is algebraically closed.
Thus kG is a direct product of full matrix rings over k:

\[ kG \cong \prod_{i=1}^{s} \mathfrak{M}_{d_i}(k). \tag{6.3.9} \]

Here each factor provides an irreducible representation of G, of degree d_i, and these
representations are inequivalent because the product is direct, so the coefficients
corresponding to different factors are linearly independent. On counting dimensions
in (6.3.9) we obtain the desired equality (first obtained by Frobenius in 1896):

\[ \sum_i d_i^2 = |G|. \tag{6.3.10} \]

Moreover, a comparison with (6.3.8) shows that the set of representations provided
by (6.3.9) is complete, so s = t. Of course we can also take the regular representation
of G, i.e. we take kG as G-module under right multiplication by G. Each irreducible
representation ρ_i occurs d_i times, representing the d_i rows of the corresponding
matrix. Thus we again obtain the equation (6.3.10).




Let G be a group and U, V any G-modules. Then U ⊗ V may be defined as a
G-module by the equation

\[ (u \otimes v)g = ug \otimes vg. \]

Since the right-hand side is bilinear in u and v, this defines an action and U ⊗ V is
easily verified to be a G-module. If the representations afforded by U, V are ρ, σ
relative to bases u_1, ..., u_m, v_1, ..., v_n respectively, then

\[ (u_i \otimes v_p)g = \sum_{j,q} \rho_{ij}(g)\,\sigma_{pq}(g)\,u_j \otimes v_q; \]

hence the representation of G afforded by U ⊗ V is the tensor product of the
matrices: ρ ⊗ σ. When G is finite, then by Maschke’s theorem, each representation
is a diagonal sum of irreducible ones, and if ρ_1, ..., ρ_r are all the inequivalent
irreducible representations of G, it is enough to determine the products ρ_i ⊗ ρ_j.
We have

\[ \rho_i \otimes \rho_j = \sum_k g_{ijk}\,\rho_k, \tag{6.3.11} \]

where the g_ijk are non-negative integers, indicating how often a given representation
ρ_k occurs in the tensor product.
For each representation ρ of G we define its kernel as

\[ K = \{x \in G \mid \rho(x) = 1\}. \]

Thus K is a normal subgroup of G and G has a faithful irreducible representation iff
some irreducible representation of G has a trivial kernel. This need not be the case,
but at any rate we have


Theorem 6.3.5. For any finite group G the intersection of the kernels of all the
irreducible representations over a field of characteristic zero or prime to |G| is trivial.


Proof. The regular representation of G is faithful; since it is a direct sum of
irreducible representations, the conclusion follows. ∎


Exercises 


1. Find all irreducible representations of Sym_3 by reducing the regular representation.

2. Show that for any representation ρ of a finite group G the set N =
{x ∈ G | det ρ(x) = 1} is a normal subgroup of G with cyclic quotient.

3. Let G be a finite group, k an algebraically closed field of characteristic 0 and ρ, σ
inequivalent irreducible representations of G of degrees c, d. Show that for any
c × d matrix T we have Σ_x ρ(x^{-1})Tσ(x) = 0. Further show that for a d × d
matrix P we have Σ_x σ(x^{-1})Pσ(x) = d^{-1}.|G|.tr(P).I.

4. Show that if ρ_1, ..., ρ_r are irreducible pairwise inequivalent representations of a
group and ρ = ⊕ c_i ρ_i, then the centralizer of ρ has dimension Σ c_i^2. Use this fact
to obtain another proof of (6.3.10).

5. Let G be a finite group and let d be the degree of an irreducible representation of
G over Z, i.e. a homomorphism G → GL_d(Z). Show that every prime dividing d
must also divide |G|.

6. Show that the only matrix of order 2 in SL_2(C) is −I. Show that for any integers
k, m, n > 1 there is a group gp{a, b | (ab)^k = a^m = b^n = 1}. (Hint. Find a
faithful representation ρ in PSL_2(C) with tr ρ(a) = λ + λ^{-1}, tr ρ(b) = μ + μ^{-1},
tr ρ(ab) = ν + ν^{-1}, where λ, μ, ν are primitive 2m-th, 2n-th, 2k-th roots of 1
respectively, and take ρ(a) upper and ρ(b) lower triangular.)

7. Show that φ(g) = |{(x, y) ∈ G × G | g = x^{-1}y^{-1}xy}| is a character, where (x, y) is
the form defined by (6.3.2), and that φ = Σ_i |G|χ_i/d_i, where the χ_i, d_i are as in
Corollary 6.3.4.

8. Prove the converse of Schur’s lemma: If every endomorphism of a G-module V 
(over C) is scalar, then V is simple. 

9. Show that each irreducible representation is contained in the regular representa- 
tion. Deduce that the number of isomorphism types of irreducible representations 
of a finite group is finite. 


6.4 Characters 


A one-dimensional representation is also called a linear character or simply a
character if we are dealing with an abelian group. Such characters have already
been discussed in Section 4.9 of BA. We recall that an irreducible representation
of an abelian group over C (an algebraically closed field) is necessarily
one-dimensional, by Schur’s lemma. For a non-abelian group there will always be
irreducible representations of degrees greater than 1 (see Proposition 6.4.2 below)
and the definition then runs as follows. Given any representation ρ of a group G
over C, its character is defined as

\[ \chi(x) = \operatorname{tr} \rho(x), \quad x \in G, \tag{6.4.1} \]

where tr denotes the trace of the matrix ρ(x); thus if ρ(x) = (ρ_ij(x)), then
tr ρ(x) = Σ_i ρ_ii(x). When χ and ρ are related as in (6.4.1), ρ is said to afford the
character χ. For example, any representation of degree 1 is its own character; in
particular, the function χ_1(x) = 1 for all x ∈ G is the character afforded by the trivial
representation, and is called the trivial or also the principal character.

Some obvious properties of characters are collected in 


Proposition 6.4.1. The character of a representation is independent of the choice of
basis, i.e. equivalent representations have the same character. Moreover, each character
is a class function on G, i.e. it is constant on conjugacy classes:

\[ \chi(y^{-1}xy) = \chi(x), \quad x, y \in G. \tag{6.4.2} \]

The degree of χ is χ(1) and for any x ∈ G of order n, χ(x) is a sum of n-th roots of 1.
If ρ_1, ρ_2 are any representations with the characters χ_1, χ_2 then the characters
afforded by ρ_1 ⊕ ρ_2 and ρ_1 ⊗ ρ_2 are χ_1 + χ_2 and χ_1χ_2 respectively.




Proof. Let χ be the character of ρ; any equivalent representation has the form
T^{-1}ρ(x)T and since tr(BA) = tr(AB) for any square matrices of the same size, we
have

\[ \operatorname{tr}\bigl(T^{-1}\rho(x)T\bigr) = \operatorname{tr} \rho(x), \]

so both ρ and T^{-1}ρT afford the same character. For the same reason,
tr ρ(y^{-1}xy) = tr(ρ(y)^{-1}ρ(x)ρ(y)) = tr ρ(x), and (6.4.2) follows. χ(1) equals the
degree because char C = 0, while A = ρ(x) satisfies A^n = I if x^n = 1. Thus A satisfies
an equation with distinct roots and so can be transformed to diagonal form over C;
its diagonal elements λ again satisfy λ^n = 1, so they are n-th roots of 1, and χ(x) is
the sum of these diagonal elements.

The final assertion follows because tr(A ⊕ B) = tr A + tr B and tr(A ⊗ B) =
tr A.tr B. ∎


The next result may be regarded as a generalization of the fact that a finite abelian 
group is isomorphic to its dual. 


Proposition 6.4.2. For any finite group G the number of linear characters is (G : G′),
where G′ is the derived group. Hence every non-abelian group has irreducible
representations of degree greater than 1.


Proof. Every homomorphism α : G → C^× corresponds to a homomorphism from
G/G′ to C^× and conversely. But we know from Theorem 4.9.1 of BA that the
number of such homomorphisms is |G/G′| = (G : G′), so the result follows by
(6.3.10). ∎


In Section 6.3 we defined an inner product on kG; we shall now see how in the case
of k = C we can define a hermitian inner product on CG. Let us put

\[ (f, g) = |G|^{-1} \sum_x \overline{f(x)}\,g(x). \tag{6.4.3} \]

Since every character α is a sum of roots of 1, the complex conjugate of α(x) is
α(x^{-1}). Hence for characters the formula (6.4.3) can also be written

\[ (\alpha, \beta) = |G|^{-1} \sum_x \alpha(x^{-1})\,\beta(x); \tag{6.4.4} \]

so in this case it agrees with the inner product introduced in Section 6.3. From the
orthogonality relations in Theorem 6.3.3 we obtain the following orthogonality
relations for irreducible characters, by putting j = i, q = p in (6.3.4) and summing
over i and p:

\[ (\chi, \psi) = \begin{cases} 1 & \text{if } \chi = \psi, \\ 0 & \text{otherwise.} \end{cases} \tag{6.4.5} \]


Thus under the metric (6.4.4) the irreducible characters form an orthonormal
system. Suppose that all the irreducible representations of G are ρ_1, ..., ρ_r, with
characters χ_1, ..., χ_r. Any representation of G is equivalent to a direct sum of
irreducible ones, by complete reducibility (Theorem 6.2.1):

\[ \rho = \nu_1\rho_1 \oplus \nu_2\rho_2 \oplus \cdots \oplus \nu_r\rho_r. \]

Hence its character is χ = ν_1χ_1 + ... + ν_rχ_r, and here ν_i is given by

\[ \nu_i = (\chi, \chi_i), \]

using (6.4.5). This shows that any representation of G (over C) is determined up to
equivalence by its character. For example, the regular representation μ(x) : a ↦ ax
has the form μ = ⊕ d_iρ_i and so, evaluating its character at 1, we again find

\[ |G| = \sum_i d_i^2. \]
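
The multiplicity formula ν_i = (χ, χ_i) is easy to apply once a character table is known. The sketch below is an illustration, not part of the text: it assumes the standard character table of Sym_3 (classes of the identity, the transpositions and the 3-cycles) and decomposes the regular character and the permutation character on three points. The labels `trivial`, `sign`, `standard` and the function names are hypothetical.

```python
from fractions import Fraction

# Conjugacy classes of Sym_3: identity, transpositions, 3-cycles.
class_sizes = [1, 3, 2]
order = sum(class_sizes)

# Irreducible characters of Sym_3 (assumed standard values, all real).
irreducibles = {
    "trivial":  [1,  1,  1],
    "sign":     [1, -1,  1],
    "standard": [2,  0, -1],
}

def inner(alpha, beta):
    """(alpha, beta) = |G|^{-1} sum_x alpha(x^{-1}) beta(x), summed class by class.
    In Sym_3 every element is conjugate to its inverse, so alpha(x^{-1}) = alpha(x)."""
    total = sum(h * a * b for h, a, b in zip(class_sizes, alpha, beta))
    return Fraction(total, order)

def multiplicities(chi):
    """Multiplicity of each irreducible character in chi."""
    return {name: inner(chi, chi_i) for name, chi_i in irreducibles.items()}

regular = [6, 0, 0]        # character of the regular representation
permutation = [3, 1, 0]    # number of fixed points on {0, 1, 2}

print(multiplicities(regular))
print(multiplicities(permutation))
```

For the regular character each multiplicity equals the degree, which is the statement |G| = Σ d_i^2 in this case (6 = 1 + 1 + 4).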


The inner product (6.4.4) can also be expressed directly in terms of the modules 
affording the representation: 


Proposition 6.4.3. Let G be a finite group and U, V any G-modules over an
algebraically closed field k of characteristic 0, affording representations with characters α, β
respectively. Then

\[ (\alpha, \beta) = \dim_k\bigl(\operatorname{Hom}_G(U, V)\bigr). \tag{6.4.6} \]


Proof. Suppose first that U, V are simple. Then by Lemma 6.3.2 the right-hand side
of (6.4.6) is 1 or 0 according as U, V are or are not isomorphic, and this is just the
value of the left, by (6.4.5). Now the general case follows because every G-module is a
direct sum of simple modules. ∎


The number on the right of (6.4.6) is also called the intertwining number of U 
and V. 

Above we have found the character of the regular representation, and it is not 
difficult to obtain an explicit expression for it. Sometimes we shall want an expres- 
sion for the character of the representation afforded by a given right ideal of the 
group algebra. Since the latter is semisimple, each right ideal is generated by an 
idempotent, and the next result expresses the character in terms of this idempotent. 


Proposition 6.4.4. Let G be a finite group, A = kG its group algebra and I = eA a right
ideal in A, with idempotent generator e = Σ e(x)x. Then the character afforded by I is

\[ \chi(g) = \sum_x e(x g^{-1} x^{-1}); \]

more generally, we have for any b ∈ A,

\[ \chi(b) = \sum_{x,y} e(x y^{-1} x^{-1})\,b(y). \]


Proof. Consider the operation ρ(b) : a ↦ eab, representing the projection on I
followed by the right regular representation. We have

\[ a.\rho(b) = eab = \sum_{u,v,w} e(u v^{-1} w^{-1})\,a(w)\,b(v).u. \tag{6.4.7} \]

Now x.ρ(b) = Σ_y ρ_xy(b)y, by expressing ρ in terms of the natural basis, hence on
putting a = x, u = y in (6.4.7), we find

\[ \rho_{xy}(b) = \sum_v e(y v^{-1} x^{-1})\,b(v). \]

Therefore the character afforded by I is

\[ \chi(b) = \operatorname{tr} \rho(b) = \sum_x \rho_{xx}(b) = \sum_{x,y} e(x y^{-1} x^{-1})\,b(y). \] ∎


So far we have regarded the characters as functions on G, but as we saw in 
Proposition 6.4.1, they are really class functions and we may regard them as 
functions on the set of all conjugacy classes of G. We shall now interpret the ortho- 
gonality relations in this way and in the process find that the number of irreducible 
characters of G is equal to the number of conjugacy classes. 

Our first remark is that a(x) (x ∈ G) is a class function iff the element
a = Σ a(x)x lies in the centre of the group algebra. For we have

\[ y^{-1}ay = \sum_x a(x)\,y^{-1}xy = \sum_z a(yzy^{-1})\,z \quad \text{for all } y \in G; \]

hence

\[ y^{-1}ay = a \ \text{ for all } y \in G \iff a(yzy^{-1}) = a(z) \ \text{ for all } y, z \in G. \]

Thus an element a = Σ a(x)x lies in the centre of kG iff a(x) is constant on
conjugacy classes. This just means that we can write a = Σ a_λ c_λ, where c_λ is the sum
of all elements in a given conjugacy class C_λ. It follows that these class sums form
a basis for the centre of kG. This proves the first part of our next result:


Theorem 6.4.5. Let G be a finite group and k an algebraically closed field of
characteristic 0 or prime to |G|. An element a = Σ a(x)x of kG lies in the centre if and only if
a(x) is a class function. Moreover, the class sums c_λ form a basis of the centre of kG and
the irreducible characters over k form a basis for the class functions; thus if χ_1, ..., χ_r
are the different irreducible characters, then any class function α on G may be written in
the form

\[ \alpha = \sum_i (\chi_i, \alpha)\,\chi_i. \tag{6.4.8} \]

Hence the number of irreducible characters equals the number of conjugacy classes
of G.

To complete the proof we denote the number of irreducible characters by r and
the number of conjugacy classes by s; as we have seen, s is the dimension of the
centre of kG. Now kG is the direct product of r full matrix rings over k. Clearly each
matrix ring has a one-dimensional centre and the centre of the direct product is easily
seen to be the direct product of the centres. Hence the centre of kG is
r-dimensional over k, and it follows that r = s. The characters are independent, by
the orthogonality relation (6.4.5), hence they form a basis, which is orthonormal
by (6.4.5), and we therefore have (6.4.8). ∎
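
The first part of the theorem can be verified mechanically for a small group. The following sketch is an illustration with hypothetical helper names, not part of the text: it computes the conjugacy classes of Sym_3, forms the class sums in the group algebra (stored as dictionaries from group elements to integer coefficients), and checks that each class sum commutes with every group element while a single transposition does not.

```python
from itertools import permutations

G = list(permutations(range(3)))

def mul(x, y):
    """Product xy: apply x first, then y."""
    return tuple(y[x[i]] for i in range(3))

def inverse(g):
    inv = [0] * 3
    for i, gi in enumerate(g):
        inv[gi] = i
    return tuple(inv)

def conjugacy_classes():
    classes, seen = [], set()
    for x in G:
        if x not in seen:
            cls = {mul(mul(inverse(y), x), y) for y in G}   # all y^{-1} x y
            classes.append(cls)
            seen |= cls
    return classes

def algebra_mul(a, b):
    """Product in kG of elements stored as {group element: coefficient}."""
    out = {}
    for x, ax in a.items():
        for y, by in b.items():
            z = mul(x, y)
            out[z] = out.get(z, 0) + ax * by
    return {g: c for g, c in out.items() if c != 0}

def is_central(a):
    return all(algebra_mul(a, {g: 1}) == algebra_mul({g: 1}, a) for g in G)

classes = conjugacy_classes()
class_sums = [{x: 1 for x in cls} for cls in classes]
transposition = {(1, 0, 2): 1}    # a single group element, not a class sum

print(len(classes), [is_central(c) for c in class_sums], is_central(transposition))
```

Sym_3 has three classes, matching its three irreducible characters of degrees 1, 1 and 2 (a standard fact assumed here rather than computed).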


Let us consider the multiplication table for the basis c_1, ..., c_r of the centre of kG.
Any product of classes C_λC_μ is a union of a number of classes, hence we have

\[ c_\lambda c_\mu = \sum_\nu \gamma_{\lambda\mu\nu}\,c_\nu, \tag{6.4.9} \]

where the γ_λμν are non-negative integers. If ρ is any irreducible representation of G,
then ρ(c_λ) = η_λ I by Schur’s lemma, where η_λ ∈ k. Let χ denote the character of ρ, d
its degree and write h_λ = |C_λ|. Taking traces in the last equation and writing χ^{(λ)} for
the value of χ on C_λ, we find

\[ h_\lambda \chi^{(\lambda)} = \eta_\lambda d. \tag{6.4.10} \]

If we apply ρ to (6.4.9) we obtain η_λη_μ = Σ_ν γ_λμν η_ν, and it follows that η_λ is a root of
the equation

\[ \det(xI - \Gamma_\lambda) = 0, \quad \text{where } \Gamma_\lambda = (\gamma_{\lambda\mu\nu}). \tag{6.4.11} \]

This shows η_λ to be an algebraic integer. Further, (6.4.10) shows that for each
irreducible character χ, h_λχ^{(λ)}/d is a root of (6.4.11); since (6.4.11) is of degree r,
its roots are the values h_λχ_i^{(λ)}/d_i for the different irreducible characters of G.

As a consequence of this development we can show that the degrees of the irredu- 
cible representations divide the group order. We recall from Section 9.4 of BA that 
the sum and product of algebraic integers are again integral. 


Proposition 6.4.6 (Frobenius). For any finite group G the degree of each irreducible
representation over C divides |G|.


Proof. Let χ be an irreducible character and d its degree. As a sum of roots of 1, each
value of χ is an algebraic integer, and so is h_λχ^{(λ)}/d, as we saw above. By the
orthogonality relations (6.4.5) we have

\[ \sum_\lambda \frac{h_\lambda \chi^{(\lambda)}}{d}\,\overline{\chi^{(\lambda)}} = \frac{|G|}{d}, \]

and since sums and products of algebraic integers are integral, it follows that |G|/d
is integral, as we had to show. ∎




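We note that the relations (6.4.5) can be written

\[ |G|^{-1} \sum_\lambda h_\lambda\,\overline{\chi_i^{(\lambda)}}\,\chi_j^{(\lambda)} = \delta_{ij}, \]

where h_λ = |C_λ|. This tells us that the r × r matrix ((h_λ/|G|)^{1/2} χ_i^{(λ)}) is unitary; hence
so is its conjugate transpose and we have

\[ \frac{1}{|G|}\,h_\lambda^{1/2} h_\mu^{1/2} \sum_i \overline{\chi_i^{(\lambda)}}\,\chi_i^{(\mu)} = \delta_{\lambda\mu}. \]

When λ ≠ μ, we can omit the factor involving h_λ, h_μ and so we obtain the second
orthogonality relation for characters:


Proposition 6.4.7. For any finite group G, if χ_i^{(λ)} is the value of the i-th irreducible
character on the conjugacy class C_λ and |C_λ| = h_λ, then

\[ \sum_i \overline{\chi_i^{(\lambda)}}\,\chi_i^{(\mu)} = \begin{cases} |G|/h_\lambda & \text{if } \lambda = \mu, \\ 0 & \text{if } \lambda \ne \mu. \end{cases} \] ∎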

The character of a representation may also be used to describe its kernel. 


Proposition 6.4.8. Let G be a finite group and ρ a representation of G over C with
character χ. Then the kernel of ρ is determined by its character and is given by

\[ K_\rho = \{x \in G \mid \chi(x) = \chi(1)\}. \]

Proof. Denote the degree of ρ by d, so that χ(1) = d. If x ∈ G has order n, then x is
represented by a matrix ρ(x), whose eigenvalues are n-th roots of 1. Thus we have
χ(x) = ω_1 + ... + ω_d, where ω_i^n = 1.

If χ(x) = χ(1) = d, it follows that

\[ d = |\omega_1 + \cdots + \omega_d| \le |\omega_1| + \cdots + |\omega_d| = d. \]

Hence equality must hold, which is possible only if ω_1 = ... = ω_d, and since
Σ ω_i = d, each ω_i must be 1. Since ρ(x) satisfies an equation with distinct roots,
it can be transformed to diagonal form. Hence ρ(x) = I and so x is in the kernel
of ρ. The converse is clear. ∎


Combining this result with Theorem 6.3.5, we obtain 


Corollary 6.4.9. Let G be a finite group. Given x ∈ G, if χ(x) = χ(1) for all irreducible
characters χ of G over C, then x = 1. ∎


As a final result on characters we give a formula for characters of permutation
modules. By a permutation module for a group G we understand a G-module V
with a basis B, the natural basis, on which G acts by permutations. The character
afforded by V, χ(x), is just the number of points of B fixed by x ∈ G. Suppose
that G is transitive on B and let H be the stabilizer of a point p ∈ B. Then each
point of B is fixed by |H| elements, for its stabilizer is conjugate to H. Recalling the
orbit formula from BA, Section 2.1: |B| = (G : H), we therefore have

\[ \sum_x \chi(x) = |B|.|H| = |G|. \]

For general permutation modules we can apply the result to each orbit and obtain


Proposition 6.4.10. Let G be a finite permutation group acting on a set B and denote
the character afforded by the corresponding permutation module by χ. Then

\[ \sum_x \chi(x) = n.|G|, \]

where n is the number of orbits of B under the action of G. ∎
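
The orbit count in Proposition 6.4.10 can be confirmed by brute force. The sketch below is an illustration with a group chosen by us, not from the text: the group generated by the transposition (0 1) and the 3-cycle (2 3 4) acting on five points, which has two orbits. The helpers `generate`, `orbits` and `chi` are hypothetical names.

```python
def mul(x, y):
    """Product xy of permutations given as tuples: apply x first, then y."""
    return tuple(y[x[i]] for i in range(len(x)))

def generate(generators):
    """Closure of the generators under multiplication (finite, so a group)."""
    identity = tuple(range(len(generators[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        x = frontier.pop()
        for s in generators:
            y = mul(x, s)
            if y not in group:
                group.add(y)
                frontier.append(y)
    return group

def orbits(group, n):
    remaining, result = set(range(n)), []
    while remaining:
        p = min(remaining)
        orbit = {g[p] for g in group}
        result.append(orbit)
        remaining -= orbit
    return result

def chi(g):
    """Permutation character: the number of fixed points of g."""
    return sum(1 for i, gi in enumerate(g) if gi == i)

# (0 1) and (2 3 4) acting on {0, 1, 2, 3, 4}.
G = generate([(1, 0, 2, 3, 4), (0, 1, 3, 4, 2)])
character_sum = sum(chi(g) for g in G)
number_of_orbits = len(orbits(G, 5))
print(len(G), character_sum, number_of_orbits)
```

Here |G| = 6 and the fixed-point counts add up to 12, which is n.|G| with n = 2.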
An elaboration of this result allows us to evaluate (χ, χ):


Theorem 6.4.11. Let G be a transitive permutation group, denote by H the stabilizer of
a point and let r be the number of double cosets in the decomposition G = ∪ Hs_iH. If χ
is the character afforded by the corresponding permutation module, then

\[ (\chi, \chi) = |G|^{-1} \sum_a \chi(a)^2 = r. \tag{6.4.12} \]


Proof. Let G act on B and consider the action of H on B. We may replace B by
the coset decomposition with respect to H: G = ∪ Hx_λ. Here each orbit of H
corresponds to a double coset Hs_iH. Now take a ∈ G, say a = hs_ih′, where h, h′ ∈ H.
If χ(a) = t, then a leaves t cosets Hx_λ fixed, i.e. a ∈ x_λ^{-1}Hx_λ for t values of λ. Now
the action of x_λ^{-1}Hx_λ on B yields r.|H| for the sum of its characters, by Proposition
6.4.10. There are (G : H) such conjugates, so taking all these characters, we obtain
r.|H|.(G : H) = r.|G|. The character sum includes each value χ(a), and if χ(a) = t,
this term occurs t times, so in all we obtain Σ χ(a)^2, i.e. (6.4.12). ∎
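
Formula (6.4.12) can be tested numerically by using the fact from the proof that the double cosets HsH correspond to the orbits of the point stabilizer H. The sketch below is an illustration with hypothetical helper names: it compares |G|^{-1} Σ χ(a)^2 with the number of H-orbits for Sym_4 on four points and for the dihedral group of order 8 acting on the vertices of a square (our choice of examples).

```python
from fractions import Fraction
from itertools import permutations

def mul(x, y):
    """Product xy of permutations given as tuples: apply x first, then y."""
    return tuple(y[x[i]] for i in range(len(x)))

def generate(generators):
    """Closure of the generators under multiplication."""
    identity = tuple(range(len(generators[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        x = frontier.pop()
        for s in generators:
            y = mul(x, s)
            if y not in group:
                group.add(y)
                frontier.append(y)
    return group

def chi(g):
    """Permutation character: the number of fixed points of g."""
    return sum(1 for i, gi in enumerate(g) if gi == i)

def character_norm(group):
    """(chi, chi) = |G|^{-1} * sum of chi(a)^2 over the group."""
    return Fraction(sum(chi(a) ** 2 for a in group), len(group))

def stabilizer_orbits(group, n, point=0):
    """Number of orbits on {0, ..., n-1} of the stabilizer H of `point`."""
    H = [g for g in group if g[point] == point]
    seen, count = set(), 0
    for p in range(n):
        if p not in seen:
            seen |= {h[p] for h in H}
            count += 1
    return count

sym4 = set(permutations(range(4)))
# Rotation i -> i+1 and the reflection fixing the vertices 0 and 2.
dihedral = generate([(1, 2, 3, 0), (0, 3, 2, 1)])

print(character_norm(sym4), stabilizer_orbits(sym4, 4))
print(character_norm(dihedral), stabilizer_orbits(dihedral, 4))
```

For Sym_4 both numbers are 2, reflecting double transitivity, while for the dihedral group both are 3.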


Examples 


We end this section with some examples of representations and characters; through- 
out, k is algebraically closed of characteristic 0. 


1. Let A be a finite abelian group. Then kA is a commutative semisimple algebra,
hence a direct product of copies of k, and all its irreducible representations are
of degree 1. We take a basis a_1, ..., a_r of A (in multiplicative notation), where
a_i has order n_i, and denote by ε_i any primitive n_i-th root of 1. Then for any
integers ν_1, ..., ν_r the mapping

\[ a_1^{\alpha_1} \cdots a_r^{\alpha_r} \mapsto \varepsilon_1^{\alpha_1\nu_1} \cdots \varepsilon_r^{\alpha_r\nu_r} \]

is a representation. We get distinct characters for n_i different values of ν_i, and the
n_1 ⋯ n_r characters so obtained are all different and constitute all the irreducible
characters of A. This corresponds to the fact that the dual of A, i.e. its group of
characters, is isomorphic to A itself (see Theorem 4.9.1 of BA).




2. Consider $D_m$, the dihedral group of order $2m$, with generators $a$, $b$ and defining
relations $a^m = 1$, $b^2 = 1$, $b^{-1}ab = a^{-1}$. Every element can be uniquely expressed
in the form $a^\alpha b^\beta$, where $0 \le \alpha < m$, $0 \le \beta < 2$. It is easily verified that
the conjugacy classes are for odd $m$: $\{a^r, a^{-r}\}$ $(r = 1, \dots, (m-1)/2)$, $\{1\}$, $\{a^\alpha b\}$;
and for even $m$: $\{a^r, a^{-r}\}$ $(r = 1, \dots, m/2 - 1)$, $\{1\}$, $\{a^{m/2}\}$, $\{a^{2\alpha}b\}$,
$\{a^{2\alpha+1}b\}$. Further it may be checked that the index of the derived group in $D_m$ is 2
when $m$ is odd and 4 when $m$ is even.

To find the representations of $D_m$ we have a homomorphism $D_m \to C_2$
obtained by mapping $a \mapsto 1$, which gives rise to two representations, the trivial
representation and $a \mapsto 1$, $b \mapsto -1$. Further representations are obtained by
taking a primitive $m$-th root of 1, say $\omega$, and writing

$$a \mapsto \begin{pmatrix} \omega^i & 0 \\ 0 & \omega^{-i} \end{pmatrix}, \quad b \mapsto \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \quad (i = 1, 2, \dots, [m/2]).$$    (6.4.13)

When $m$ is odd, we thus obtain $(m-1)/2$ irreducible representations of degree 2
and two of degree 1, and this is a complete set, because the total number is
$(m-1)/2 + 2$, which is the number of conjugacy classes. We also note the
degree equation

$$\frac{m-1}{2} \cdot 2^2 + 2 \cdot 1^2 = 2m.$$

When $m$ is even, (6.4.13) becomes reducible for $i = m/2$ and we obtain two more
representations of degree 1, $a \mapsto -1$, $b \mapsto \pm 1$. We now have $m/2 + 3$ classes and
the degree equation becomes

$$\left(\frac{m}{2} - 1\right) \cdot 2^2 + 4 \cdot 1^2 = 2m.$$

3. Character tables for the symmetric groups $\mathrm{Sym}_3$, $\mathrm{Sym}_4$. In the tables below the
rows indicate the different characters, while the columns indicate the conjugacy
classes, headed by a typical element and the order of each class. As is well
known and easily checked, the conjugacy class of each permutation is determined
by its cycle structure, hence the number of conjugacy classes of $\mathrm{Sym}_n$ is the
number of partitions of $n$ into positive integers. Moreover, the derived group is
the alternating group $\mathrm{Alt}_n$ and its index in $\mathrm{Sym}_n$ is 2; hence there are just two
linear characters, the trivial character and the sign character.

    Sym_3     1     3     2
              1   (12)  (123)
    χ_1       1     1     1
    χ_2       1    -1     1
    χ_3       2     0    -1

    Sym_4     1     6     8      6        3
              1   (12)  (123)  (1234)  (12)(34)
    χ_1       1     1     1      1        1
    χ_2       1    -1     1     -1        1
    χ_3       3     1     0     -1       -1
    χ_4       3    -1     0      1       -1
    χ_5       2     0    -1      0        2



In each table the first row is the trivial character and the second row the sign
character. The degrees in the first column are found by solving the degree
equation $\sum d_i^2 = n!$. Each character $\chi$ gives rise to a 'conjugate' character $\chi\chi_2$,
corresponding to the tensor product $\rho \otimes \rho_2$ of the representations. Thus $\chi_3$
and $\chi_4$ in the second table are conjugate and $\chi_3$ in the first table and $\chi_5$ in the
second table are self-conjugate, hence they vanish on odd classes. Now the
remaining values are found by orthogonality, using Proposition 6.4.7.

We note that the characters for $\mathrm{Sym}_3$, $\mathrm{Sym}_4$ are rational. This is a general
feature of symmetric groups; in fact, as we shall see in Section 6.6, all the irreducible
representations of $\mathrm{Sym}_n$ can be expressed over $\mathbf{Q}$.
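
The dihedral example lends itself to a quick numerical check. The sketch below is only an illustration under our own naming (`dihedral_rep`, `character_norm` and the other helpers are not from the text): it builds the matrices of (6.4.13) and evaluates $(\chi, \chi)$ over the $2m$ elements $a^\alpha b^\beta$, which should be 1 exactly when the representation is irreducible.

```python
import cmath

def dihedral_rep(m, i):
    """Matrices assigned to a and b by (6.4.13), with omega = exp(2*pi*sqrt(-1)/m)."""
    w = cmath.exp(2j * cmath.pi * i / m)
    a = ((w, 0), (0, 1 / w))
    b = ((0, 1), (1, 0))
    return a, b

def matmul(x, y):
    return tuple(
        tuple(sum(x[r][k] * y[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )

def close_to(x, y, tol=1e-9):
    return all(abs(x[r][c] - y[r][c]) < tol for r in range(2) for c in range(2))

def character_norm(m, i):
    """(chi, chi) = |G|^{-1} * sum of |chi(g)|^2 over g = a^alpha b^beta."""
    a, b = dihedral_rep(m, i)
    total, power = 0.0, ((1, 0), (0, 1))
    for _ in range(m):
        for g in (power, matmul(power, b)):
            total += abs(g[0][0] + g[1][1]) ** 2
        power = matmul(power, a)
    return total / (2 * m)

def degree_equation_holds(m):
    """Degree equation of the text, for odd and for even m."""
    if m % 2:
        return (m - 1) // 2 * 4 + 2 == 2 * m
    return (m // 2 - 1) * 4 + 4 == 2 * m
```

For instance `character_norm(5, 1)` is 1 up to rounding, while `character_norm(6, 3)` is 2, reflecting the reducibility of (6.4.13) for $i = m/2$.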


Exercises 


1. Verify the calculations in the above examples, and make a character table for the
quaternion group of order 8.

2. Show that any two elements $x$, $y$ of a group $G$ are conjugate iff $\chi(x) = \chi(y)$ for all
irreducible characters $\chi$ of $G$.

3. Show that any character which is zero on all elements $\ne 1$ of $G$ is a multiple of the
regular representation.

4. Show that if $S = \{x_1, \dots, x_n\}$ is a transitive $G$-set, then the vector space spanned
by the $x_i - x_j$ is a $G$-module affording a representation which does not contain the
trivial representation.

5. Let $\chi$ be a character of a finite group over a field of characteristic 0. Show that
$(\chi, \chi)$ is a positive integer.

6. Let $\rho$ be an irreducible representation of degree $d$ of $G$, with character $\chi$. Show
that the simple factor of the group algebra corresponding to $\rho$ has the unit
element $e$ given by $|G| \cdot e = d \cdot \sum_x \chi(x^{-1})x$.

7. If $g_{ijk}$ is defined as in (6.3.11), show that $g_{ijk} = (\chi_i\chi_j, \chi_k)$. Use Proposition 6.4.7 to
evaluate the sum of all the $g_{ijk}^2$, and deduce the formula $\sum_{ijk} g_{ijk}^2 = |G| \cdot \sum_\lambda h_\lambda^{-1}$,
where $h_\lambda$ is the order of the $\lambda$-th conjugacy class.
Show that the value of this sum for $\mathrm{Sym}_3$ is 11 and for $\mathrm{Sym}_4$ is 43, and verify
the formula in these cases.

8. (Theodor E. Molien, 1897) Let $G$ be a finite group with irreducible characters
$\chi_1, \dots, \chi_r$ and $U$ a $G$-module with basis $u_1, \dots, u_d$. Show that the character
afforded by the $G$-module $U \otimes U \otimes \dots \otimes U$ ($n$ factors) is $\sum n_i\chi_i$, where $n_i$ is
the coefficient of $t^n$ in $|G|^{-1} \sum_x \chi_i(x) \det(I - t\rho(x))^{-1}$, $\rho$ being the representation
afforded by $U$. Deduce that the number of invariants of degree $n$ in the $u$'s is
the coefficient of $t^n$ in $|G|^{-1} \sum_x \det(I - t\rho(x))^{-1}$.


6.5 Complex representations 


When we restrict our representations to be over the complex numbers, several 
simplifications can be made. In the first place we can then confine attention to uni- 
tary representations; secondly, by examining the different types of irreducible repre- 
sentations we obtain an estimate for the sum of their degrees in terms of the number 
of elements of order 2 in the group. 




Throughout this section we only consider complex representations, thus we take $\mathbf{C}$
as the ground field. We recall that a square matrix $P$ over $\mathbf{C}$ is said to be unitary if
$PP^H = I$, where $P^H = \bar{P}^T$ is the transpose of the complex conjugate of $P$. The $d \times d$
unitary matrices over $\mathbf{C}$ form a group, the unitary group, written $U_d(\mathbf{C})$. By a unitary
representation of $G$, of degree $d$, we understand a homomorphism $G \to U_d(\mathbf{C})$. In the
special case where the representation is real, we have an orthogonal representation,
because $U_d(\mathbf{C}) \cap GL_d(\mathbf{R}) = O_d(\mathbf{R})$, the orthogonal group.

Any $G$-module $U$ affording a unitary representation of $G$ has a hermitian metric
defined on it which is invariant under $G$. Thus there is a positive definite hermitian
form $(u, v)$ on $U$:

$$(\lambda u + \lambda' u', v) = \lambda(u, v) + \lambda'(u', v), \quad (v, u) = \overline{(u, v)}, \quad (u, u) > 0 \ \text{for } u \ne 0,$$    (6.5.1)

which in addition satisfies

$$(ux, vx) = (u, v) \quad \text{for all } u, v \in U,\ x \in G.$$    (6.5.2)

For, given a unitary representation $\rho$, relative to the basis $v_1, \dots, v_d$ of $U$, let us
define a hermitian form by writing $(v_i, v_j) = \delta_{ij}$. If the vectors $u$, $v$ have coordinate
rows $\alpha$, $\beta$, then $(u, v) = \alpha\beta^H$ and
$(ux, vx) = \alpha\rho(x)(\beta\rho(x))^H = \alpha\rho(x)\rho(x)^H\beta^H = \alpha\beta^H$,
because $\rho(x)\rho(x)^H = I$ by unitarity. Conversely, if we have a positive definite
hermitian form satisfying (6.5.2), then transformation by $x$ preserves the metric and
so must be unitary.

Any unitary representation is completely reducible. To verify this assertion we take
the corresponding module $U$. If $W$ is any submodule, then its orthogonal complement
$W^\perp = \{u \in U \mid (u, w) = 0 \text{ for all } w \in W\}$ is again a $G$-submodule and is
complementary to $W$. In terms of representations we can also verify this fact by
noting that a reduced matrix $\rho(x)$ must be fully reduced, because its transpose is
$\overline{\rho(x)}^{-1}$.

The importance of unitary representations is underlined by the following result, 
which incidentally provides another proof of Maschke’s theorem. 


Proposition 6.5.1. Every complex representation of a finite group $G$ is equivalent to a
unitary representation; every real representation is equivalent to an orthogonal
representation. In particular every real or complex representation is completely reducible.


Proof. Let $\rho$ be the representation of $G$ and $U$ a $G$-module affording $\rho$. We have to
find a positive definite hermitian form on $U$ which is invariant under $G$. Take any
positive definite hermitian form $h(u, v)$ on $U$ and define

$$(u, v) = \sum_{x \in G} h(ux, vx).$$

For any $a \in G$ we have

$$(ua, va) = \sum_x h(uax, vax) = \sum_y h(uy, vy) = (u, v).$$

Thus $(u, v)$ is invariant under $G$, and it is clearly positive definite hermitian, as a sum
of such forms. On choosing an orthonormal basis, we obtain the desired unitary
representation; starting from a real representation, we obtain an orthogonal
representation of $G$. Now the last part follows by the earlier remarks. ∎
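
The averaging step in this proof is easy to try out. The following Python sketch rests on our own choices: the group is the copy of $\mathrm{Sym}_3$ generated by two integer matrices `A` and `B` (not taken from the text), vectors are rows acted on from the right, and the starting form is $h(u, v) = uv^T$. The averaged form is then $(u, v) = uMv^T$ with $M = \sum_x \rho(x)\rho(x)^T$, and invariance amounts to $\rho(a)M\rho(a)^T = M$.

```python
def matmul(x, y):
    return tuple(
        tuple(sum(x[r][k] * y[k][c] for k in range(len(y))) for c in range(len(y[0])))
        for r in range(len(x))
    )

def transpose(x):
    return tuple(zip(*x))

def generate_group(gens):
    """Close a finite set of invertible matrices under multiplication."""
    n = len(gens[0])
    identity = tuple(tuple(int(r == c) for c in range(n)) for r in range(n))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = matmul(g, s)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

# Assumed example: a non-orthogonal real representation of Sym_3 of degree 2.
A = ((0, -1), (1, -1))    # element of order 3
B = ((0, 1), (1, 0))      # element of order 2
G = generate_group([A, B])

# Gram matrix of the averaged form: M = sum over x of rho(x) rho(x)^T.
M = tuple(
    tuple(sum(matmul(g, transpose(g))[r][c] for g in G) for c in range(2))
    for r in range(2)
)
invariant = all(matmul(matmul(a, M), transpose(a)) == M for a in G)
print(len(G), M, invariant)
```

The matrix `M` is symmetric and positive definite, so an orthonormal basis for it turns the given representation into an orthogonal one, as in the proposition.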


In dealing with unitary representations, we usually restrict equivalence to be by
unitary matrices. Thus two unitary representations $\rho$, $\sigma$ are unitarily equivalent if
$\sigma(x) = P\rho(x)P^{-1}$ for a unitary matrix $P$. For irreducible representations this is
automatic, for if $\rho$, $\sigma$ are equivalent, we have $\sigma(x)S = S\rho(x)$ for an invertible matrix $S$.
Taking hermitian conjugates, we have $S^H\sigma(x)^H = \rho(x)^HS^H$, hence

$$\rho(x)S^HS = \rho(x)S^H\sigma(x)^H\sigma(x)S = \rho(x)\rho(x)^HS^HS\rho(x) = S^HS\rho(x).$$

Since $\rho$ is irreducible, we have $S^HS = \lambda I$ by Schur's lemma, and here $\lambda > 0$, because
$S^HS$ is positive definite. Writing $\lambda = \mu^2$ $(\mu > 0)$ and $T = \mu^{-1}S$, we obtain a unitary
matrix $T$ such that $\sigma(x) = T\rho(x)T^{-1}$.

The irreducible complex representations may be classified as follows. Let $\rho$ be an
irreducible complex representation (not necessarily unitary). If $\rho$ is equivalent to a
real representation, it is said to be of the first kind. If $\rho$ is not of the first kind,
but is equivalent to its conjugate $\bar\rho$, it is said to be of the second kind; in the
remaining case $\rho$ is of the third kind. Our next result shows how to distinguish the first two
of these cases. In the proof we shall need the elementary fact that a symmetric unitary
matrix can always be written as the square of a symmetric unitary matrix. We recall
the proof.

Let $P$ be symmetric and unitary; as a unitary matrix $P$ is similar to a diagonal
matrix, say $S^{-1}PS = D$ for a unitary matrix $S$, and

$$SDS^{-1} = P = P^T = (S^T)^{-1}DS^T,$$

i.e. $S^TSD = DS^TS$. Now $D$ is diagonal and again unitary, so its diagonal elements
have absolute value 1 and we can find a diagonal matrix $E$ such that $E^2 = D$ and

$$S^TSE = ES^TS.$$    (6.5.3)

Put $Q = SES^{-1}$; then $Q$ is unitary, because $E$ is, and $Q^2 = SE^2S^{-1} = P$. Moreover,
$Q^T = (S^T)^{-1}ES^T = SES^{-1}$ by (6.5.3), hence $Q^T = Q$, so $Q$ is also symmetric.


Proposition 6.5.2. Let $\rho$ be a complex irreducible representation of a group $G$ such that

$$\bar\rho(x) = P^{-1}\rho(x)P \quad \text{for some } P \in U_d(\mathbf{C}).$$    (6.5.4)

Then either $P^T = P$ and $\rho$ is of the first kind, or $P^T = -P$ and $\rho$ is of the second kind.


Proof. Taking complex conjugates in (6.5.4) we have $\rho(x) = P^T\bar\rho(x)(P^T)^{-1}$, because
$\bar{P}^{-1} = \bar{P}^H = P^T$; therefore

$$P^TP^{-1}\rho(x) = P^T\bar\rho(x)P^{-1} = \rho(x)P^TP^{-1},$$

hence $P^TP^{-1} = \lambda I$, so $P^T = \lambda P$. Transposing, we find $P = \lambda P^T = \lambda^2P$, hence
$\lambda^2 = 1$ and so $\lambda = \pm 1$. This shows that either $P^T = P$ or $P^T = -P$.




Now it is clear from (6.5.4) that $\rho$ cannot be of the third kind, so it is enough to
show that $\rho$ is of the first kind iff $P^T = P$. If $\rho$ is of the first kind, then there is an
invertible matrix $L$ such that $L^{-1}\rho(x)L$ is real for all $x \in G$, hence by (6.5.4),

$$L^{-1}\rho(x)L = \bar{L}^{-1}\bar\rho(x)\bar{L} = \bar{L}^{-1}P^{-1}\rho(x)P\bar{L}.$$

It follows that $P\bar{L}L^{-1}$ commutes with $\rho(x)$ and so $P\bar{L}L^{-1} = \alpha I$, i.e. $P = \alpha L\bar{L}^{-1}$. Now
if $P^T = -P$, then $P^{-1} = P^H = -\bar{P}$ and so
$I = PP^{-1} = -\alpha L\bar{L}^{-1} \cdot \bar\alpha\bar{L}L^{-1} = -\alpha\bar\alpha I$,
which is a contradiction. Therefore $P^T = P$ in this case.

Conversely, assume that $P^T = P$. Then $P$ is symmetric unitary and by the above
remark, $P = Q^2$, for a symmetric unitary matrix $Q$. Hence $\bar{Q} = \bar{Q}^T = Q^{-1}$, and
$\overline{Q^{-1}\rho(x)Q} = QP^{-1}\rho(x)PQ^{-1} = Q^{-1}\rho(x)Q$. Therefore $\rho$ is equivalent to a real
representation, and so is of the first kind. ∎


If in Proposition 6.5.2, $\rho$ is of the second kind and its degree is denoted by $d$, we
have $P^T = -P$, hence $\det P = (-1)^d \det P$, and it follows that $(-1)^d = 1$. Hence $d$
must be even and we have


Corollary 6.5.3. Any complex irreducible representation of the second kind has even
degree. ∎


For any character $\chi$ of a finite group $G$ let us define its indicator as

$$\nu(\chi) = |G|^{-1} \sum_{x \in G} \chi(x^2).$$    (6.5.5)


The three kinds of irreducible representation may be described in terms of the
indicator:


Theorem 6.5.4. Let $\chi$ be a complex irreducible character of a finite group $G$ and $\nu(\chi)$
its indicator as in (6.5.5). Then $\chi$ is of the first kind if $\nu(\chi) = 1$, of the second kind if
$\nu(\chi) = -1$ and of the third kind if $\nu(\chi) = 0$.


Proof. Let $\rho$ be a representation affording $\chi$ and denote its degree by $d$. We may take
$\rho$ to be unitary, and then have

$$|G| \cdot \nu(\chi) = \sum_x \chi(x^2) = \sum_x \sum_{i,j} \rho_{ij}(x)\rho_{ji}(x) = \sum_x \sum_{i,j} \rho_{ij}(x)\bar\rho_{ij}(x^{-1}).$$

If $\rho$ is of the third kind, $\rho$ and $\bar\rho$ are inequivalent and then $\nu(\chi) = 0$ by the
orthogonality relations (Theorem 6.3.3). If $\rho$ is of the first kind, we may assume it to
be real; in that case

$$|G| \cdot \nu(\chi) = \sum_{i,j} \sum_x \rho_{ij}(x)\rho_{ij}(x^{-1}) = \sum_{i,j} \delta_{ij}\delta_{ji} \cdot |G|/d = |G|,$$

and hence $\nu(\chi) = 1$. Finally, if $\rho$ is of the second kind, then by Proposition 6.5.2
there is a unitary matrix $P$ such that $P^T = -P$ and $P^{-1}\rho(x)P = \bar\rho(x)$; hence on
writing $P = (p_{ij})$, $P^{-1} = (q_{ij})$ we have, again by the orthogonality relations,

$$\nu(\chi) = |G|^{-1} \sum_{i,j,r,s} \sum_x \rho_{ij}(x)\, q_{ir}\, \rho_{rs}(x^{-1})\, p_{sj} = d^{-1} \sum_{i,j,r,s} q_{ir}p_{sj}\,\delta_{is}\delta_{jr}$$
$$= d^{-1} \sum_{i,r} q_{ir}p_{ir} = d^{-1}\,\mathrm{tr}(P^{-1}P^T) = -\frac{d}{d} = -1.$$

Thus $\nu(\chi) = -1$, as we wished to show. ∎
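
The three cases of Theorem 6.5.4 can be observed on small matrix groups. The sketch below is an illustration under our own assumptions: the generators chosen for the quaternion group, for $\mathrm{Sym}_3$ and for the cyclic group of order 4 are not taken from the text, and all matrix entries are Gaussian integers, so that rounding in `normalise` is exact.

```python
def matmul(x, y):
    n = len(x)
    return tuple(
        tuple(sum(x[r][k] * y[k][c] for k in range(n)) for c in range(n))
        for r in range(n)
    )

def normalise(x):
    # Entries are Gaussian integers in these examples, so rounding loses nothing
    # and makes the matrices safe to store in a set.
    return tuple(tuple(complex(round(z.real), round(z.imag)) for z in row) for row in x)

def generate_group(gens):
    n = len(gens[0])
    identity = normalise(tuple(tuple(int(r == c) for c in range(n)) for r in range(n)))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = normalise(matmul(g, s))
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def indicator(gens):
    """nu(chi) = |G|^{-1} * sum of tr(rho(x)^2), as in (6.5.5)."""
    group = generate_group(gens)
    total = sum(sum(matmul(g, g)[i][i] for i in range(len(g))) for g in group)
    return round((total / len(group)).real)

nu_sym3 = indicator([((0, -1), (1, -1)), ((0, 1), (1, 0))])         # real representation
nu_quaternion = indicator([((1j, 0), (0, -1j)), ((0, 1), (-1, 0))])  # degree 2, order 8
nu_cyclic4 = indicator([((1j,),)])                                   # faithful linear character
print(nu_sym3, nu_quaternion, nu_cyclic4)
```

The quaternion group thus provides a representation of the second kind, of even degree as required by Corollary 6.5.3.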


Finally we show how the indicator is related to the solution of the equation $x^2 = a$
in $G$.


Proposition 6.5.5. Let $G$ be a finite group. Given $a \in G$, let $t(a)$ be the number of
solutions of the equation $x^2 = a$ in $G$. If $\chi_1, \dots, \chi_r$ are all the inequivalent complex
irreducible characters of $G$, then

$$t(a) = \sum_{i=1}^{r} \nu(\chi_i)\chi_i(a).$$    (6.5.6)


Proof. It is clear that $t(a)$ is a class function on $G$, so it can be written in the form
$t(a) = \sum c_i\chi_i(a)$, and it only remains to show that $c_i = \nu(\chi_i)$. The sets
$T(a) = \{x \in G \mid x^2 = a\}$ form a partition of $G$, and we have by Theorem 6.4.5, using (6.5.8) to
determine the coefficient $c_i$:

$$|G| \cdot c_i = \sum_{a \in G} t(a)\bar\chi_i(a) = \sum_{a \in G} \sum_{x \in T(a)} \bar\chi_i(x^2) = \sum_{x \in G} \bar\chi_i(x^2) = |G| \cdot \overline{\nu(\chi_i)},$$

by (6.5.5). Since the indicator is always real, the desired relation $c_i = \nu(\chi_i)$
follows. ∎
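
Formula (6.5.6) can be verified directly for $\mathrm{Sym}_3$. In the following sketch the character table is entered by hand and the classes are recognized by the number of fixed points; both devices, like the function names, are our own shortcuts for this small case.

```python
from itertools import permutations

def compose(p, q):
    """Apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))

G = list(permutations(range(3)))            # Sym_3

def cycle_class(p):
    """Label the conjugacy class of p in Sym_3 by its number of fixed points."""
    fixed = sum(1 for i in range(3) if p[i] == i)
    return {3: "1", 1: "(12)", 0: "(123)"}[fixed]

# Character table of Sym_3: trivial, sign and the character of degree 2.
table = {
    "1":     (1,  1,  2),
    "(12)":  (1, -1,  0),
    "(123)": (1,  1, -1),
}

def indicator(i):
    """nu(chi_i) as defined in (6.5.5)."""
    return sum(table[cycle_class(compose(x, x))][i] for x in G) // len(G)

def square_roots(a):
    """t(a): the number of solutions of x^2 = a."""
    return sum(1 for x in G if compose(x, x) == a)

nu = [indicator(i) for i in range(3)]
checks = [
    (square_roots(a), sum(nu[i] * table[cycle_class(a)][i] for i in range(3)))
    for a in G
]
print(nu, checks)
```

All three indicators equal 1 here, and each pair in `checks` consists of two equal numbers: 4 for the unit element, 0 for a transposition and 1 for a 3-cycle.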


Let us apply the result for $a = 1$. In this case $\chi(1) = d$ is the degree of $\chi$. Bearing
in mind that $\nu(\chi) \le 1$, with equality precisely when $\chi$ is of the first kind, by Theorem
6.5.4, we obtain


Corollary 6.5.6. If $t$ is the number of elements of order 2 in the finite group $G$, and the
degrees of its complex irreducible representations are $d_1, \dots, d_r$, then

$$t + 1 = \sum \nu(\chi_i)d_i \le \sum d_i,$$    (6.5.7)

with equality if and only if all the irreducible characters are of the first kind. ∎




As an example consider $D_m$, the dihedral group of order $2m$, where $m$ is odd. We
saw that there are $(m-1)/2$ characters of degree 2 and two linear characters, and
there are $m$ elements of order 2, namely $a^\alpha b$ in the notation of Section 6.4. The
inequality (6.5.7) in this case reads

$$m + 1 \le ((m-1)/2) \cdot 2 + 2 \cdot 1.$$

Since equality holds here, all the representations must be of the first kind, and going
through the proof of Proposition 6.5.2, we find that the representation given at the
end of Section 6.4 is equivalent to the real representation

$$a \mapsto \begin{pmatrix} \cos 2\pi i/m & \sin 2\pi i/m \\ -\sin 2\pi i/m & \cos 2\pi i/m \end{pmatrix}, \quad b \mapsto \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \quad (i = 1, \dots, (m-1)/2).$$


As a second example consider the degrees of the irreducible representations of $\mathrm{Sym}_5$.
We shall see in Section 6.6 that all irreducible representations of the symmetric group
can be taken to be rational (even integral). The elements of order 2 in $\mathrm{Sym}_5$ are
transpositions or products of two transpositions. There are $5 \cdot 4/2 = 10$ transpositions $(i\ j)$
and $10 \cdot 3/2 = 15$ products of the form $(i\ j)(k\ l)$; hence $t = 25$, $t + 1 = 26$. To find
the number of conjugacy classes we count the partitions of 5:
$(5)$, $(4, 1)$, $(3, 2)$, $(3, 1^2)$, $(2, 1^3)$, $(2^2, 1)$, $(1^5)$. So we have to look for seven positive
integers, divisors of $5! = 120$, whose sum is 26 and whose squares have the sum
120, by (6.3.10). Proposition 6.4.2 tells us that just two of the numbers are 1, because
$(\mathrm{Sym}_5 : \mathrm{Alt}_5) = 2$, and this leaves $d_1, \dots, d_5$ such that

$$\sum_{1}^{5} d_i = 24, \qquad \sum_{1}^{5} d_i^2 = 118.$$

Further, we can rule out the low dimensions 2, 3 because an integral representation
can be reduced mod 2, and we cannot have a homomorphism of $\mathrm{Sym}_5$ into $GL_2(\mathbf{F}_2)$
(order $(2^2 - 1)(2^2 - 2) = 6$) or into $GL_3(\mathbf{F}_2)$ (order $(2^3 - 1)(2^3 - 2)(2^3 - 2^2) = 168$)
which maps $\mathrm{Alt}_5$ non-trivially. This leaves as the only possibility for the degrees
1, 1, 4, 4, 5, 5, 6.
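
That the conditions just listed leave only one possibility can be confirmed by an exhaustive search. The sketch below simply encodes the constraints stated in the text (seven divisors of 120, exactly two equal to 1, none equal to 2 or 3, sum 26, sum of squares 120); the variable names are our own.

```python
from itertools import combinations_with_replacement

order = 120          # |Sym_5| = 5!
target_sum = 26      # t + 1, where t = 25 is the number of elements of order 2

# Two degrees are 1; the remaining five are divisors of 120 other than 1, 2, 3.
candidates = [d for d in range(4, order + 1) if order % d == 0]

solutions = [
    (1, 1) + rest
    for rest in combinations_with_replacement(candidates, 5)
    if sum(rest) == target_sum - 2 and sum(d * d for d in rest) == order - 2
]
print(solutions)     # [(1, 1, 4, 4, 5, 5, 6)]
```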


Exercises 


1. Use the information on $\mathrm{Sym}_5$ to make a character table for it.

2. Find the matrix of transformation which reduces the representation (6.4.14) for
$D_m$ to the form given above.

3. Let $V$ be a simple $G$-module with real character. Show that there is just one
invariant bilinear form $b$ on $V$, up to scalar multiples. Show further that $b$ is
either symmetric or antisymmetric, and that the latter can happen only when
$\dim V$ is even.

4. Show that in a group $G$ of odd order no element $\ne 1$ is conjugate to its inverse.
Deduce that for any $y \ne 1$, $\sum_i \chi_i(y)^2 = \sum_i \chi_i(y)\overline{\chi_i(y^{-1})} = 0$. Hence show that
$G$ has an irreducible character of the third kind.

5. Show that every non-trivial irreducible character of a group of odd order is of the
third kind. (Hint. Use the methods of Exercise 4 and the fact that $\chi(1)$ is an odd
integer.)

6. Find a quadratic polynomial $f$ such that a complex irreducible representation of
the $n$-th kind has the indicator $f(n)$. For what numbering of the different kinds
of representations can $f$ be chosen as a linear polynomial?


6.6 Representations of the symmetric group 


In principle all the irreducible representations of a finite group can be obtained by
taking the regular representation and reducing it completely. This is a lengthy
undertaking, but for certain types of groups there are more direct methods. We shall
describe such a method for symmetric groups; it is due mainly to Georg Frobenius
and Alfred Young, with some simplifications by John von Neumann.

We recall that the symmetric group of degree $n$, $\mathrm{Sym}_n$, or simply $S_n$, is the group of
all permutations of $1, 2, \dots, n$. It has order $n!$ and each permutation can be written
as a product of disjoint cycles, e.g.

$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 7 & 5 & 6 & 8 & 1 & 9 & 2 & 4 & 3 \end{pmatrix} = (1\ 7\ 2\ 5)(3\ 6\ 9)(4\ 8).$$

The cycles have no digits in common, so they commute and we can arrange them by
decreasing length. If the lengths are $\alpha_1, \dots, \alpha_h$, we may thus suppose that

$$\alpha_1 + \alpha_2 + \dots + \alpha_h = n$$    (6.6.1)

and

$$\alpha_1 \ge \alpha_2 \ge \dots \ge \alpha_h.$$    (6.6.2)

If $\lambda_1$ of the $\alpha_i$ are 1, $\lambda_2$ are 2, etc., we can also write $1^{\lambda_1}2^{\lambda_2} \dots r^{\lambda_r}$ (where $r$ is the
largest $\alpha_i$) for the set of $\alpha$'s. This is called the cycle structure of the permutation.
Two permutations have the same cycle structure iff they are conjugate in $S_n$: If

$$f = (a_1 \dots a_{\alpha_1})(a_{\alpha_1+1} \dots a_{\alpha_1+\alpha_2}) \dots (\dots a_n),$$
$$g = (b_1 \dots b_{\alpha_1})(b_{\alpha_1+1} \dots b_{\alpha_1+\alpha_2}) \dots (\dots b_n),$$

then $g = p^{-1}fp$, where $p : a_i \mapsto b_i$. Hence two permutations with the same cycle
structure are conjugate; the converse is clear.

It follows that the number of conjugacy classes of $S_n$ equals the number of
sequences $(\alpha_1, \dots, \alpha_h)$ of positive integers satisfying (6.6.1) and (6.6.2); by Theorem
6.4.5 this is also the number of inequivalent irreducible representations of $S_n$. To get
a complete system of irreducible representations of $S_n$ we need only construct for
each sequence $(\alpha_1, \dots, \alpha_h)$ an irreducible representation, such that representations
corresponding to different sequences are inequivalent. This will be our aim in
what follows.
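
As a small illustration of cycle structures, the following Python sketch computes the cycle lengths of a permutation and counts the distinct cycle structures in $S_n$ for small $n$. The function names are ours, and permutations are stored 0-based as tuples of images, so the example permutation from the text is shifted by 1.

```python
from itertools import permutations

def cycle_type(p):
    """Cycle lengths of a permutation of 0..n-1, in descending order."""
    seen, lengths = set(), []
    for start in range(len(p)):
        if start not in seen:
            length, i = 0, start
            while i not in seen:
                seen.add(i)
                i = p[i]
                length += 1
            lengths.append(length)
    return tuple(sorted(lengths, reverse=True))

# Second row of the two-line symbol in the text, shifted to 0-based images.
images = (7, 5, 6, 8, 1, 9, 2, 4, 3)
example = tuple(v - 1 for v in images)

def class_count(n):
    """Number of distinct cycle structures, i.e. conjugacy classes, of S_n."""
    return len({cycle_type(p) for p in permutations(range(n))})

print(cycle_type(example))              # (4, 3, 2)
print(class_count(4), class_count(5))   # 5 7
```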




We shall write $\alpha \vdash n$ to indicate that $\alpha = (\alpha_1, \dots, \alpha_h)$ is a descending partition of
$n$, as in (6.6.1), (6.6.2). To each such partition there corresponds a diagram of $n$
squares arranged in $h$ rows of $\alpha_i$ squares in the $i$-th row. For example $(4, 3^2, 1)$
corresponds to

    □ □ □ □
    □ □ □
    □ □ □
    □

This is called a Young diagram and is again denoted by $\alpha$. Since $\alpha \vdash n$, we can write
the numbers 1 to $n$ in these squares (in some order). The result is called a Young
tableau. For example

    2 6 5
    1 3
    4

We see that for each diagram there are $n!$ distinct tableaux. If $T_\alpha$ is a Young tableau
and $g \in S_n$, then $T_\alpha g$ denotes the tableau obtained from $T_\alpha$ by applying $g$.

We can let each Young tableau represent a permutation by regarding the rows as
cycles. In this way the tableau just illustrated represents the permutation $(2\ 6\ 5)(1\ 3)$.
If $T_\alpha$ represents $c$ in this way, then $T_\alpha g$ represents $g^{-1}cg$, as is easily verified.

We now fix a tableau $T_\alpha$ and define two subgroups of $S_n$ as follows: $P_\alpha$ is the set of
all permutations leaving each symbol in its row, briefly the set of row permutations of
$T_\alpha$, and $Q_\alpha$ is the set of all permutations leaving each symbol in its column, the set of
column permutations of $T_\alpha$. Here $P_\alpha$ and $Q_\alpha$ depend of course on $T_\alpha$ and not merely
on the diagram $\alpha$. For example, in the above tableau $P_\alpha$ is generated by $(2\ 6)$, $(2\ 6\ 5)$,
$(1\ 3)$, while $Q_\alpha$ is generated by $(2\ 1)$, $(2\ 1\ 4)$, $(3\ 6)$. If we apply $g$ to $T_\alpha$ and use the
remark made earlier, we obtain


Lemma 6.6.1. Let $T_\alpha$ be a Young tableau with groups $P_\alpha$, $Q_\alpha$ and let $g \in S_n$. Then the
groups for $T_\alpha g$ are $g^{-1}P_\alpha g$ and $g^{-1}Q_\alpha g$. ∎
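
The groups of row and column permutations of a given tableau are easily listed by machine. The sketch below uses the tableau with rows $(2\ 6\ 5)$, $(1\ 3)$, $(4)$ described in the text; the encoding of permutations as tuples of images of $1, \dots, 6$ and the helper names are our own.

```python
from itertools import permutations

tableau = [[2, 6, 5], [1, 3], [4]]          # rows of the tableau, shape (3, 2, 1)
symbols = sorted(x for row in tableau for x in row)

def columns(t):
    """Columns of a tableau whose rows have non-increasing length."""
    return [[row[c] for row in t if c < len(row)] for c in range(len(t[0]))]

def block_preserving(blocks):
    """All permutations of the symbols that map every block onto itself."""
    block_of = {x: i for i, block in enumerate(blocks) for x in block}
    result = []
    for images in permutations(symbols):
        g = dict(zip(symbols, images))
        if all(block_of[x] == block_of[g[x]] for x in symbols):
            result.append(tuple(images))
    return result

P = block_preserving(tableau)               # row permutations
Q = block_preserving(columns(tableau))      # column permutations
common = set(P) & set(Q)
print(len(P), len(Q), common)
```

Here $|P_\alpha| = |Q_\alpha| = 3! \cdot 2! \cdot 1! = 12$, and the two groups meet only in the identity, since a permutation fixing both the row and the column of every symbol fixes every symbol.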


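Let $A$ be the group algebra of $S_n$ (over the rational numbers, say) and write $\varepsilon(g)$ or
$\varepsilon_g$ for the sign of the permutation $g$. Writing $p$, $q$ for the typical elements of $P_\alpha$, $Q_\alpha$
respectively, we define two elements of $A$, the sum of the row permutations and the
alternating sum of the column permutations:

$$f_\alpha = \sum_p p, \qquad g_\alpha = \sum_q \varepsilon_q q.$$    (6.6.3)


Lemma 6.6.2. Let $T_\alpha$ be a Young tableau and $f_\alpha$, $g_\alpha$ as in (6.6.3), and write $|P_\alpha| = r_\alpha$,
$|Q_\alpha| = s_\alpha$. Then

$$pf_\alpha = f_\alpha p = f_\alpha \quad \text{for } p \in P_\alpha,$$    (6.6.4)

$$qg_\alpha = g_\alpha q = \varepsilon_q g_\alpha \quad \text{for } q \in Q_\alpha,$$    (6.6.5)

$$f_\alpha^2 = r_\alpha f_\alpha, \qquad g_\alpha^2 = s_\alpha g_\alpha.$$    (6.6.6)


Proof. We have $pf_\alpha = \sum pp' = \sum p' = f_\alpha$, $\varepsilon_q qg_\alpha = \sum \varepsilon_q\varepsilon_{q'} qq' = g_\alpha$, which
establishes one half of (6.6.4), (6.6.5); the other half follows similarly. Now

$$f_\alpha^2 = \sum_p pf_\alpha = r_\alpha f_\alpha, \qquad g_\alpha^2 = \sum_q \varepsilon_q qg_\alpha = \sum_q g_\alpha = s_\alpha g_\alpha.$$ ∎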


Next comes a basic combinatorial lemma on which everything else depends. We
order the partitions lexicographically, by writing $\alpha > \beta$ if the first non-zero difference
$\alpha_i - \beta_i$ is positive (where $\alpha$ or $\beta$ is completed by 0's if necessary). This provides
a total ordering of the set of all partitions of $n$.


Lemma 6.6.3. Let $\alpha, \beta \vdash n$ and take any tableaux $T_\alpha$, $T'_\beta$. If $\alpha \ge \beta$ and no two
numbers in the same row of $T_\alpha$ occur in the same column of $T'_\beta$, then (i) $\alpha = \beta$ and
(ii) $T'_\alpha = T_\alpha qp$ for some $p \in P_\alpha$, $q \in Q_\alpha$.

Some care is needed here, for we have abused notation by writing $P_\alpha$ instead of the
more accurate $P(T_\alpha)$. In fact we shall take $P_\alpha$, $Q_\alpha$ to be the groups associated with $T_\alpha$
and write $P'_\alpha$, $Q'_\alpha$ for the groups of $T'_\alpha$.


Proof. Since $\alpha \ge \beta$, we have $\alpha_1 \ge \beta_1$. The first row of $T_\alpha$ has $\alpha_1$ numbers, which
must be in different columns of $T'_\beta$, so $\beta_1 \ge \alpha_1$ and hence $\beta_1 = \alpha_1$. Now for a
certain column permutation $q'_1 \in Q'_\beta$ we can bring these numbers into the top
row, though possibly in a different order from that in $T_\alpha$.

Leaving out the top row in $T'_\beta q'_1$ and $T_\alpha$ we can repeat the argument, showing that
$\beta_2 = \alpha_2$ and finding $q'_2$ such that $T'_\beta q'_1q'_2$ has the same numbers as $T_\alpha$ in the second
row as well as the top row. After $h$ steps (if $\alpha$ has $h$ parts) we get $q' = q'_1q'_2 \dots q'_h$
such that $T'_\alpha q'$ differs from $T_\alpha$ only by a row permutation: $T'_\alpha q' = T_\alpha p$, where
$p \in P_\alpha$, $q' \in Q'_\alpha$. Now $T'_\alpha = T_\alpha pq'^{-1}$, hence by Lemma 6.6.1,

$$Q'_\alpha = (pq'^{-1})^{-1}Q_\alpha(pq'^{-1}) = q'p^{-1}Q_\alpha pq'^{-1};$$

it follows that $q = pq'^{-1}p^{-1} \in Q_\alpha$. Therefore $qp = pq'^{-1}$ and we have $T'_\alpha = T_\alpha qp$,
as claimed, with $p \in P_\alpha$, $q \in Q_\alpha$. ∎




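Corollary 6.6.4. Given $\alpha, \beta \vdash n$ and any tableaux $T_\alpha$, $T'_\beta$, if $\alpha > \beta$, then

$$f_\alpha x^{-1}g'_\beta x = 0 \quad \text{for all } x \in S_n,$$    (6.6.7)

$$f_\alpha Ag'_\beta = 0.$$    (6.6.8)


Proof. We begin by showing that

$$f_\alpha g'_\beta = 0 \quad \text{for } \alpha > \beta.$$    (6.6.9)

By Lemma 6.6.3 there must be two numbers $i$, $k$ which lie in the same row of $T_\alpha$ and
in the same column of $T'_\beta$. Write $t = (i\ k)$; then $f_\alpha t = f_\alpha$, $tg'_\beta = -g'_\beta$, hence
$f_\alpha g'_\beta = f_\alpha t \cdot tg'_\beta = -f_\alpha g'_\beta$ and (6.6.9) follows. Now $x^{-1}g'_\beta x$ corresponds to the
tableau $T'_\beta x$ and (6.6.7) follows by applying (6.6.9) with $g'_\beta$ replaced by $x^{-1}g'_\beta x$.
Replacing $x^{-1}$ by $y$ and multiplying by $y$ on the right, we have $f_\alpha yg'_\beta = 0$, and now
(6.6.8) follows by summing over $y$. ∎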


Given a Young tableau $T_\alpha$, let us put

$$h_\alpha = f_\alpha g_\alpha = \sum_{p,q} \varepsilon_q pq.$$    (6.6.10)

Clearly $h_\alpha \ne 0$, because the coefficient of 1 is 1:

$$h_\alpha = 1 + \dots$$    (6.6.11)

We may consider $h_\alpha$ as an operator symmetrizing the rows and antisymmetrizing the
columns; it is called the Young symmetrizer associated with the tableau $T_\alpha$.
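
For a concrete instance of (6.6.10), the following Python sketch builds the Young symmetrizer of a tableau of shape $(2, 1)$ in the group algebra of $S_3$ and squares it. The encoding is our own assumption: symbols are $0, 1, 2$, permutations are tuples of images, and elements of the group algebra are dictionaries mapping permutations to coefficients.

```python
from itertools import permutations

def compose(p, q):
    """Product in the group: apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))

def sign(p):
    inversions = sum(
        1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j]
    )
    return -1 if inversions % 2 else 1

def stabilisers(blocks, n):
    """Permutations of 0..n-1 mapping every block (row or column) onto itself."""
    block_of = {x: i for i, b in enumerate(blocks) for x in b}
    return [
        p for p in permutations(range(n))
        if all(block_of[i] == block_of[p[i]] for i in range(n))
    ]

def multiply(u, v):
    """Product of two group algebra elements stored as {permutation: coefficient}."""
    out = {}
    for p, a in u.items():
        for q, b in v.items():
            r = compose(p, q)
            out[r] = out.get(r, 0) + a * b
    return {k: c for k, c in out.items() if c != 0}

def young_symmetrizer(rows, cols, n):
    f = {p: 1 for p in stabilisers(rows, n)}          # sum of row permutations
    g = {q: sign(q) for q in stabilisers(cols, n)}    # alternating sum of column permutations
    return multiply(f, g)

# Tableau with rows (0 1), (2) and columns (0 2), (1).
h = young_symmetrizer([[0, 1], [2]], [[0, 2], [1]], 3)
h_squared = multiply(h, h)
print(h)
print(h_squared)
```

The coefficient of the identity in `h` is 1, in accordance with (6.6.11), and `h_squared` turns out to be $3h$; the factor 3 equals $3!/2$, where 2 is the degree of the corresponding representation of $S_3$.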


Proposition 6.6.5. Let $h_\alpha$ be the Young symmetrizer associated with a tableau $T_\alpha$. Then
the relation

$$pa\,\varepsilon_q q = a \quad \text{for all } p \in P_\alpha,\ q \in Q_\alpha$$    (6.6.12)

holds for $a = h_\alpha$, and any $a$ satisfying (6.6.12) must be of the form $a = \lambda h_\alpha$, where $\lambda$ is a
scalar. Moreover,

$$h_\alpha bh'_\beta = 0 \quad \text{for } \alpha > \beta \text{ and any } b \in A,$$    (6.6.13)

$$h_\alpha bh_\alpha = \mu h_\alpha \quad \text{for any } b \in A.$$    (6.6.14)


Proof. By (6.6.4), $pf_\alpha g_\alpha = f_\alpha g_\alpha$, while (6.6.5) yields $f_\alpha g_\alpha q = \varepsilon_q f_\alpha g_\alpha$; hence (6.6.12)
holds for $a = h_\alpha$. Now let $a = \sum a(x)x$ satisfy (6.6.12). Then

$$\sum \varepsilon_q a(x)pxq = \sum a(x)x \quad \text{for all } p \in P_\alpha,\ q \in Q_\alpha.$$    (6.6.15)

Comparing coefficients for $x = pq$, we find

$$\varepsilon_q a(1) = a(pq);$$    (6.6.16)

we claim that $a(x) = 0$ when $x$ is not of the form $pq$. Consider $T_\alpha$ and $T' = T_\alpha x^{-1}$;
by Lemma 6.6.3 there are two numbers $j$, $k$ in the same row in $T_\alpha$ and in the same
column in $T'$ (because $T'$ is not of the form $T_\alpha pq$). Put $t = (j\ k)$; then we have
$t \in P_\alpha$, $t \in Q'$, where $Q'$ corresponds to $T'$; therefore $t \in xQ_\alpha x^{-1}$ or also
$x^{-1}tx \in Q_\alpha$. In (6.6.15) let us take $p = t$, $q = x^{-1}tx$; comparing coefficients of $x$
we find $(-1)a(x) = a(x)$, hence $a(x) = 0$. Together with (6.6.16) this shows that
$a = a(1) \cdot \sum \varepsilon_q pq = a(1) \cdot h_\alpha$ as claimed.

Now when $\alpha > \beta$, then by Corollary 6.6.4, $h_\alpha bh'_\beta = f_\alpha g_\alpha bf'_\beta g'_\beta \in f_\alpha Ag'_\beta = 0$. This
proves (6.6.13), and (6.6.14) follows because $h_\alpha bh_\alpha$ satisfies (6.6.12). ∎


We can now obtain the irreducible representations of $S_n$ by expressing the group
algebra as a direct sum of minimal right ideals.


Theorem 6.6.6. With each Young diagram $\alpha$ let us associate one definite Young tableau
$T_\alpha$ and construct the corresponding Young symmetrizer $h_\alpha$ as an element of the group
algebra $A = \mathbf{Q}S_n$:

$$h_\alpha = \sum \varepsilon_q pq \quad (p \in P_\alpha,\ q \in Q_\alpha).$$

Then the $I^\alpha = h_\alpha A$ are simple submodules for the right regular representation of $S_n$; they
are pairwise non-isomorphic and afford a complete system of irreducible representations
for $S_n$.


Proof. We first show that $I^\alpha = h_\alpha A$ is a minimal right ideal of $A$. Given a right ideal
$\mathfrak{m} \subseteq I^\alpha$, we have $\mathfrak{m}h_\alpha \subseteq I^\alpha h_\alpha \subseteq \mathbf{Q}h_\alpha$. We distinguish two cases: (i) $\mathfrak{m}h_\alpha = \mathbf{Q}h_\alpha$, then
$I^\alpha = h_\alpha A = \mathfrak{m}h_\alpha A \subseteq \mathfrak{m}$, hence $I^\alpha = \mathfrak{m}$; (ii) $\mathfrak{m}h_\alpha = 0$, then
$\mathfrak{m}^2 \subseteq \mathfrak{m}I^\alpha = \mathfrak{m}h_\alpha A = 0$, so $\mathfrak{m}^2 = 0$ and therefore $\mathfrak{m} = 0$. It follows that $I^\alpha$ is
minimal and the representation induced by the regular representation is irreducible.

We next show that $I^\alpha$, $I^\beta$ are not isomorphic for $\alpha \ne \beta$. If $\alpha > \beta$, say, then by
Corollary 6.6.4, $h_\alpha Ah_\beta = 0$, hence $I^\alpha h_\beta = 0$, but $I^\beta h_\beta \ne 0$, because $h_\beta \ne 0$, and
the conclusion follows. Now the number of distinct diagrams is the number of partitions
of $n$, which is just the number of conjugacy classes of $S_n$, as we have seen; hence
by Theorem 6.4.5, we have a complete set of irreducible representations. ∎


It still remains to calculate the irreducible characters. By Proposition 6.6.5 we have

$$h_\alpha^2 = \mu_\alpha h_\alpha;$$    (6.6.17)

therefore $e_\alpha = \mu_\alpha^{-1}h_\alpha$ is an idempotent and the right ideal of $A$ generated by $e_\alpha$ is
minimal. To find $\mu_\alpha$ consider the operation of left multiplication by $h_\alpha$ on $I^\alpha$. By
(6.6.17) this is

$$\lambda(h_\alpha) = \mu_\alpha \cdot 1.$$    (6.6.18)

On the other hand, for $a \in I^\alpha$, $a\lambda(h_\alpha) = h_\alpha a = \sum h_\alpha(uv^{-1})a(v) \cdot u$, hence the matrix
of $\lambda(h_\alpha)$ (in the natural basis) is given by

$$\lambda(h_\alpha)_{vu} = h_\alpha(uv^{-1}),$$

and so $\mathrm{tr}(\lambda(h_\alpha)) = \sum_x h_\alpha(xx^{-1}) = \sum h_\alpha(1) = \sum 1 = n!$, by (6.6.11). A comparison
with (6.6.18) shows that $n! = \mathrm{tr}\,\lambda(h_\alpha) = d_\alpha\mu_\alpha$, where $d_\alpha$ is the degree of the
representation. Thus we obtain

$$\mu_\alpha = n!/d_\alpha.$$


Recalling Proposition 6.4.4, we find that the corresponding character is given by 


(6.6.19) 
Xo (x = bebe" get): 


From this formula it is possible to calculate $\chi_\alpha$ explicitly in terms of the partitions
$\alpha$ of $n$; we shall give the result here without proof and refer to Weyl (1939),
Chapter VII or James and Kerber (1981), Chapter 2 for details.

Let $x \in S_n$ have the cycle structure $\beta = (1^{\beta_1}2^{\beta_2} \dots n^{\beta_n})$; since the value of a
character $\chi$ at $x$ depends only on $\beta$, we shall write it as $\chi(\beta)$. Let $x_1, \dots, x_h$ be any
variables and denote the Vandermonde determinant formed from them by writing
down the terms on the main diagonal:

$$|x_1^{h-1}\ x_2^{h-2}\ \dots\ x_h^0|.$$

This is an alternating function of the $x$'s, briefly denoted by $\Delta$ in what follows. More
generally, for any positive integers $a_1, \dots, a_h$, the function

$$|x_1^{a_1}\ x_2^{a_2}\ \dots\ x_h^{a_h}|,$$

formed in the same way, is an alternating function of the $x$'s, zero unless all the $a$'s
are in descending order, and it may be written as $S(a_1, \dots, a_h)\Delta$, where $S$ as a
quotient of two alternating functions is symmetric in the $x$'s. Such a function $S$ is
called a bialternant or S-function.


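Given $\alpha = (\alpha_1, \dots, \alpha_h) \vdash n$, let us write $r_1 = \alpha_1 + (h - 1)$, $r_2 = \alpha_2 + (h - 2)$, $\dots$,
$r_h = \alpha_h$. We define the power sums in the $x$'s as $s_i = \sum_j x_j^i$, and for a
given $\beta = (\beta_1, \dots, \beta_n)$ satisfying $\sum i\beta_i = n$ put $\sigma(\beta) = s_1^{\beta_1}s_2^{\beta_2} \dots s_n^{\beta_n}$. With these
notations we have the following relation for the characters of $S_n$ corresponding to
the partitions into at most $h$ parts:

$$\sigma(\beta) \cdot |x_1^{h-1}\ x_2^{h-2}\ \dots\ x_h^0| = \sum_\alpha \chi_\alpha(\beta)\,|x_1^{r_1}\ x_2^{r_2}\ \dots\ x_h^{r_h}|.$$    (6.6.20)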
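
Relation (6.6.20) gives a practical way of computing character values: multiply the Vandermonde determinant by the power sums belonging to the cycle lengths and read off the coefficient of $x_1^{r_1}x_2^{r_2} \dots x_h^{r_h}$. The Python sketch below does this with polynomials stored as dictionaries from exponent tuples to coefficients; the representation and the function names are our own, and the number $h$ of variables is taken to be the number of parts of $\alpha$.

```python
from itertools import permutations

def sign(p):
    inversions = sum(
        1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j]
    )
    return -1 if inversions % 2 else 1

def poly_mul(f, g):
    """Multiply polynomials stored as {exponent tuple: coefficient}."""
    out = {}
    for e1, c1 in f.items():
        for e2, c2 in g.items():
            e = tuple(a + b for a, b in zip(e1, e2))
            out[e] = out.get(e, 0) + c1 * c2
    return out

def power_sum(k, h):
    """s_k = x_1^k + ... + x_h^k."""
    return {tuple(k if j == i else 0 for j in range(h)): 1 for i in range(h)}

def vandermonde(h):
    """Expansion of the determinant |x_1^{h-1} x_2^{h-2} ... x_h^0|."""
    base = tuple(range(h - 1, -1, -1))
    return {
        tuple(base[w[i]] for i in range(h)): sign(w)
        for w in permutations(range(h))
    }

def frobenius_character(alpha, cycle_lengths):
    """Coefficient of x^r in sigma(beta) * Delta, where r_i = alpha_i + (h - i)."""
    h = len(alpha)
    product = vandermonde(h)
    for k in cycle_lengths:
        product = poly_mul(product, power_sum(k, h))
    r = tuple(alpha[i] + (h - 1 - i) for i in range(h))
    return product.get(r, 0)

# Character of S_3 belonging to the partition (2, 1), on the three classes.
values = [frobenius_character((2, 1), c) for c in ([1, 1, 1], [2, 1], [3])]
print(values)    # [2, 0, -1]
```

For $\alpha = (2, 1)$ this reproduces the character of degree 2 of $\mathrm{Sym}_3$, while $\alpha = (1^3)$ gives the sign character.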


Thus $\chi_\alpha(\beta)$ is the coefficient of $x_1^{r_1}x_2^{r_2} \dots x_h^{r_h}$ in the expansion of the left-hand side of
(6.6.20). On dividing by $\Delta$ we may write (6.6.20) as

$$\sigma(\beta) = \sum_\alpha \chi_\alpha(\beta)S(r_1, \dots, r_h);$$    (6.6.21)
this shows that x,,(B) is the coefficient of xy'xv . oi in the expansion of the left- 


hand side of (6.6.21). Further, the degree a6 the frreducible character x, 1s given by 




Exercises 


1. Compute the characters of $;.S; by the methods of this section. 

2. For each partition $\alpha$ of $n$ define the conjugate partition as $\alpha'$, where $\alpha'_i$ is the
number of parts of $\alpha$ that are $\ge i$. Describe the relation between the Young
diagrams of $\alpha$ and $\alpha'$, and show that $\alpha'' = \alpha$.

3. Show that the character corresponding to the conjugate partition is given by
$\chi_{\alpha'} = \varepsilon\chi_\alpha$, where $\varepsilon$ is the sign character.

4. Show that for n > 4, S, has at least two irreducible representations of degree 
n — 1, and show that for n = 6 it has four. 

5. A group is called ambivalent if each element is conjugate to its inverse. Such a 
group, if non-trivial, must have even order. Show that an ambivalent group has 
a real character table. 

6. Show that the symmetric group $S_n$ of any degree $n$ is ambivalent (it can be shown
that the alternating group $\mathrm{Alt}_n$ is ambivalent precisely when $n = 1, 2, 5, 6, 10, 14$;
see James and Kerber (1981), Chapter 1).

7. Show that any irreducible representation of $S_n$ of degree greater than 1 is faithful
except the representation corresponding to $2^2$ for $n = 4$.


6.7 Induced representations 


There are various relations between the representations of a group and those of
its subgroups which are often needed. In the first place, if $G$ is a group and $H$ a
subgroup, then each $G$-module $V$ is an $H$-module by restriction of the action to
$H$. This $H$-module is denoted by $\mathrm{res}^G_H V$ or $\mathrm{res}\,V$ or also $V_H$; if the representation
afforded by $V$ is $\rho$, then the corresponding representation of $H$ is written $\mathrm{res}^G_H \rho$
or $\rho_H$.

We next describe a process of passing from a representation of a subgroup of $G$
to one of $G$ itself. Let $H$ be a subgroup of finite index $r$ in $G$ and consider a right
$H$-module $U$. From it we can form the right $G$-module

$$\mathrm{ind}^G_H U = U^G = U \otimes_H kG,$$    (6.7.1)

called the induced $G$-module. To find the representation afforded by $U^G$ we note
that the subspace $U \otimes H = \{u \otimes 1 \mid u \in U\}$ of $U^G$ is an $H$-module in a natural
way: for any $h \in H$ we have $(u \otimes 1)h = uh \otimes 1$. More generally we can define the
subspace $U \otimes Ha = \{u \otimes a \mid u \in U\}$ of (6.7.1) as an $H^a$-module, where $H^a = a^{-1}Ha$,
by the rule

$$(u \otimes a)x = u \cdot axa^{-1} \otimes a \quad \text{for } x \in H^a.$$

This is well-defined because $axa^{-1} \in H$ precisely when $x \in H^a$. When $U \otimes Ha$ is
considered as a right $H^a$-module in this way we shall denote it by $U^a$.
Now take a coset representation of $G$:

$$G = Ht_1 \cup \dots \cup Ht_r.$$    (6.7.2)




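With its help we can write (6.7.1) as

$$U^G = (U \otimes Ht_1) \oplus \dots \oplus (U \otimes Ht_r) = U^{t_1} \oplus \dots \oplus U^{t_r}. \tag{6.7.3}$$

Each $U^{t_\lambda}$ is an $H^{t_\lambda}$-module and under the action of G these terms are permuted
among themselves.

Given $x \in G$, we can for each $\lambda = 1, \dots, r$ find a unique $h \in H$ and $\mu$, $1 \le \mu \le r$,
such that $t_\lambda x = h t_\mu$. Then we have

$$(u \otimes t_\lambda)x = uh \otimes t_\mu;$$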


this shows how the action of x permutes the terms in (6.7.3), as well as acting on
each. To find the representation afforded by $U^G$, let us take a basis $u_1, \dots, u_d$ of
U and let $\rho(h) = (\rho_{ij}(h))$ $(h \in H)$ be the corresponding representation of H. Then
the elements $u_i \otimes t_\lambda$ form a basis of $U^G$ and we have, for any $x \in G$,

$$(u_i \otimes t_\lambda)x = u_i \otimes t_\lambda x = u_i \otimes h t_\mu,$$

where $\mu$ is chosen so that $t_\lambda x t_\mu^{-1} = h \in H$. Hence we find

$$(u_i \otimes t_\lambda)x = \sum_j \rho_{ij}(h)\, u_j \otimes t_\mu.$$


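Thus from a representation $(\rho(h))$ of degree d we obtain the induced representation
$(\rho^G(x))$ of degree rd, where $r = (G : H)$. We note that this representation
consists of r $d \times d$ blocks, one in each row and one in each column:

$$\rho^G(x) = (\rho_{\lambda\mu}(x)), \quad \text{where} \quad \rho_{\lambda\mu}(x) = \begin{cases} \rho(t_\lambda x t_\mu^{-1}) & \text{if } t_\lambda x t_\mu^{-1} \in H, \\ 0 & \text{otherwise.} \end{cases}$$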


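This is called the induced representation and is denoted by $\mathrm{ind}_H^G \rho$ or $\rho^G$. Hence if the
character of $\rho$ is $\alpha$, then $\rho^G$ has the character $\mathrm{ind}_H^G \alpha = \alpha^G$, given by

$$\alpha^G(x) = \mathrm{tr}(\rho^G(x)) = \sum_\lambda \mathrm{tr}\, \rho_{\lambda\lambda}(x) = \sum_\lambda \mathrm{tr}(\rho(t_\lambda x t_\lambda^{-1})),$$

where the sum is over all $\lambda$ such that $t_\lambda x t_\lambda^{-1} \in H$. Let us define $\dot\alpha$ on G by

$$\dot\alpha(x) = \begin{cases} \alpha(x) & \text{if } x \in H, \\ 0 & \text{otherwise.} \end{cases}$$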




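Bearing in mind that $\alpha(h) = \mathrm{tr}(\rho(h))$, we can rewrite the equation for $\alpha^G(x)$ as

$$\alpha^G(x) = \sum_\lambda \dot\alpha(t_\lambda x t_\lambda^{-1}), \tag{6.7.4}$$

where $\dot\alpha$ denotes $\alpha$ extended to G by setting it equal to 0 outside H.
Since $\alpha$ is a class function on H, we have $\dot\alpha(h t_\lambda x t_\lambda^{-1} h^{-1}) = \dot\alpha(t_\lambda x t_\lambda^{-1})$ for any $h \in H$.
It follows that

$$|H|\,\alpha^G(x) = \sum_{h \in H} \sum_\lambda \dot\alpha(h t_\lambda x t_\lambda^{-1} h^{-1}) = \sum_{g \in G} \dot\alpha(gxg^{-1}).$$

If we define $\dot\alpha^g$ by $\dot\alpha^g(x) = \dot\alpha(gxg^{-1})$, we can express this as a formula

$$\alpha^G = |H|^{-1} \sum_{g \in G} \dot\alpha^g. \tag{6.7.5}$$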
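As a small numerical illustration of (6.7.5), the following sketch evaluates the induced character by summing over all conjugates. The choice of $G = S_3$, the subgroup generated by one transposition, and all function names are assumptions made for this example only; permutations are stored as tuples and multiplied in the right-action convention.

```python
from fractions import Fraction
from itertools import permutations

def compose(p, q):
    """Product pq in the right-action convention: apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def induced_character(G, H, alpha):
    """Evaluate (6.7.5): alpha^G(x) = |H|^{-1} * sum over g in G of alpha(g x g^{-1}),
    where alpha is read as 0 outside H. `alpha` is a dict on H with integer values."""
    H = set(H)
    values = {}
    for x in G:
        total = sum(alpha[y] for g in G
                    for y in [compose(compose(g, x), inverse(g))] if y in H)
        values[x] = Fraction(total, len(H))
    return values

G = list(permutations(range(3)))   # S_3 acting on {0, 1, 2}
H = [(0, 1, 2), (1, 0, 2)]         # subgroup generated by the transposition (0 1)
trivial = {h: 1 for h in H}
chi = induced_character(G, H, trivial)

def fixed_points(x):
    return sum(1 for i, image in enumerate(x) if i == image)

# Inducing the trivial character of a point stabilizer gives the permutation
# character, i.e. the number of fixed points of each element.
print(all(chi[x] == fixed_points(x) for x in G))
```

Here H is the stabilizer of the point 2, so the induced character of the trivial character counts fixed points, in line with the description of the character induced from the trivial character.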


We remark that for any class function $\alpha$ on H this formula defines a class function $\alpha^G$
on G. To illustrate (6.7.5), consider the case H = 1. There is only one character,
namely the trivial character 1. We see from (6.7.1) that the induced representation
is just the regular representation of G. More generally, let us take an arbitrary
subgroup H of G but take the trivial character $1_H$ on H. Then we obtain an induced
character

$$\mathrm{ind}_H^G(1_H)(x) = n_x,$$

where for any $x \in G$, $n_x$ is the number of elements $t_\lambda x t_\lambda^{-1}$ $(\lambda = 1, \dots, r)$ that lie in H.


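As another example take the dihedral group of order 2m, $D_m = \mathrm{gp}\{a, b \mid a^m = b^2 = (ab)^2 = 1\}$,
with subgroup $H = \mathrm{gp}\{a\}$ of order m. H has the representation
$a \mapsto \omega$, where $\omega^m = 1$. With the transversal $t_1 = 1$, $t_2 = b$ of H in $D_m$ we have

$$t_1 a = a t_1, \quad t_2 a = a^{-1} t_2, \quad t_1 b = 1 \cdot t_2, \quad t_2 b = 1 \cdot t_1.$$

Hence we obtain the induced representation

$$a \mapsto \begin{pmatrix} \omega & 0 \\ 0 & \omega^{-1} \end{pmatrix}, \quad b \mapsto \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},$$

and the character $\chi(a) = \omega + \omega^{-1}$, $\chi(b) = 0$.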
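The matrices of this example can be checked numerically. The sketch below is only an illustration, with m = 5 and a particular primitive root $\omega$ chosen as assumptions; it verifies that the two matrices satisfy the defining relations of the dihedral group and have the stated traces.

```python
import cmath

m = 5                                   # illustrative choice of m
omega = cmath.exp(2j * cmath.pi / m)    # a primitive m-th root of 1

A = [[omega, 0], [0, 1 / omega]]        # image of a under the induced representation
B = [[0, 1], [1, 0]]                    # image of b

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matpow(X, n):
    result = [[1, 0], [0, 1]]
    for _ in range(n):
        result = matmul(result, X)
    return result

def is_identity(X, tol=1e-9):
    return all(abs(X[i][j] - (1 if i == j else 0)) < tol
               for i in range(2) for j in range(2))

def trace(X):
    return X[0][0] + X[1][1]

# Defining relations a^m = b^2 = (ab)^2 = 1.
relations_hold = (is_identity(matpow(A, m))
                  and is_identity(matpow(B, 2))
                  and is_identity(matpow(matmul(A, B), 2)))
print(relations_hold, trace(A), trace(B))
```

The trace of the first matrix is $\omega + \omega^{-1} = 2\cos(2\pi/m)$, which is real, and the trace of the second is 0.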
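It is clear from (6.7.1) that $\mathrm{ind}_H^G$ is a covariant functor; thus

$$\mathrm{ind}_H^G(U \oplus V) \cong \mathrm{ind}_H^G U \oplus \mathrm{ind}_H^G V. \tag{6.7.6}$$

Further, if $K \subseteq H \subseteq G$ and U is a K-module, then $U \otimes_K kG \cong U \otimes_K kH \otimes_H kG$;
hence

$$\mathrm{ind}_H^G(\mathrm{ind}_K^H U) \cong \mathrm{ind}_K^G U. \tag{6.7.7}$$

It follows that for any character $\alpha$ of K (indeed for any class function),

$$(\alpha^H)^G = \alpha^G. \tag{6.7.8}$$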




The following important relation between induction and restriction was proved by 
Georg Frobenius in 1898: 


Theorem 6.7.1 (Frobenius reciprocity theorem). Let G be a finite group and H a
subgroup.

(i) Given a G-module V and an H-module U, there is a natural isomorphism

$$\mathrm{Hom}_H(U, \mathrm{res}_H^G V) \cong \mathrm{Hom}_G(\mathrm{ind}_H^G U, V). \tag{6.7.9}$$

(ii) If $\alpha$, $\beta$ are class functions on H, G respectively and $\alpha^G = \mathrm{ind}_H^G \alpha$ is defined as in
(6.7.5), then

$$(\alpha, \mathrm{res}_H^G \beta)_H = (\mathrm{ind}_H^G \alpha, \beta)_G, \tag{6.7.10}$$

where the subscripts denote the groups on which the scalar products are taken.

For irreducible characters $\rho$ on G and $\sigma$ on H this tells us that the multiplicity of $\sigma$
in $\rho_H$ is equal to that of $\rho$ in $\sigma^G$.


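Proof. By adjoint associativity (see Section 2.3) we have

$$\mathrm{Hom}_G(\mathrm{ind}\, U, V) = \mathrm{Hom}_G(U \otimes_H kG, V) \cong \mathrm{Hom}_H(U, \mathrm{Hom}_G(kG, V)) \cong \mathrm{Hom}_H(U, \mathrm{res}\, V).$$

This establishes (6.7.9). For (6.7.10) we have, writing $\dot\alpha$ for $\alpha$ extended by 0 outside H
and $\dot\alpha^g(x) = \dot\alpha(gxg^{-1})$ as in (6.7.5),

$$(\mathrm{ind}_H^G \alpha, \beta)_G = |G|^{-1} \sum_{x \in G} \alpha^G(x)\beta(x^{-1})$$

$$= (|G|\,|H|)^{-1} \sum_{x, g \in G} \dot\alpha^g(x)\beta(x^{-1})$$

$$= |H|^{-1} \sum_{x \in H} \alpha(x)\beta(x^{-1})$$

$$= (\alpha, \mathrm{res}_H \beta)_H.$$

When $\alpha$, $\beta$ are characters, (6.7.10) is of course an immediate consequence of (6.7.9).
Since every class function is a linear combination of characters, this provides another
proof of (6.7.10). ∎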
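The reciprocity formula (6.7.10) can be checked on a small case. The sketch below is an illustration under stated assumptions: $G = S_3$, H generated by one transposition, $\alpha$ the sign character of H, and the scalar product taken as $|G|^{-1}\sum f(x)g(x^{-1})$; all function names are hypothetical helpers introduced for this example.

```python
from fractions import Fraction
from itertools import permutations

def compose(p, q):
    """Apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def sign(p):
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def fixed_points(p):
    return sum(1 for i, image in enumerate(p) if i == image)

def induce(G, H, alpha):
    """Induced class function via (6.7.5), with alpha read as 0 outside H."""
    H = set(H)
    return {x: Fraction(sum(alpha[y] for g in G
                            for y in [compose(compose(g, x), inverse(g))]
                            if y in H), len(H))
            for x in G}

def inner(group, f, g):
    """Scalar product |group|^{-1} * sum of f(x) g(x^{-1}) over the group."""
    return Fraction(sum(f[x] * g[inverse(x)] for x in group), len(group))

G = list(permutations(range(3)))
H = [(0, 1, 2), (1, 0, 2)]
alpha = {h: sign(h) for h in H}
alpha_G = induce(G, H, alpha)

irreducibles = {
    "trivial": {x: 1 for x in G},
    "sign": {x: sign(x) for x in G},
    "standard": {x: fixed_points(x) - 1 for x in G},
}
# For each irreducible beta of G compare (alpha, res beta)_H with (ind alpha, beta)_G.
pairs = {name: (inner(H, alpha, beta), inner(G, alpha_G, beta))
         for name, beta in irreducibles.items()}
print(pairs)
```

In this case the sign character of H occurs once in the restrictions of the sign and standard characters and not at all in the trivial character, and the induced character decomposes accordingly.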


We next derive a formula described by George Mackey in 1951, on the effect of 
inducing and then restricting a representation. Before stating it we recall that for 
any group G with subgroups H, K we have a double coset decomposition 


$$G = \bigcup_i Ha_iK \tag{6.7.11}$$

into disjoint sets. It is obtained by letting the direct product $H \times K$ act on G by the
rule: $x \mapsto h^{-1}xk$ for $x \in G$, $(h, k) \in H \times K$. Each orbit has the form HaK, for some
$a \in G$, and (6.7.11) is the partition of G into orbits. The element $(h, k) \in H \times K$
leaves $a \in G$ fixed iff $h^{-1}ak = a$, i.e. $k = a^{-1}ha$; hence the stabilizer of a is
isomorphic to $H^a \cap K$, and so we obtain the following formula for the size of a double coset:

$$|HaK| = |H| \cdot |K| / |H^a \cap K|.$$
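The double coset formula is easy to test by brute force. The following sketch uses $G = S_4$ with two point stabilizers as H and K; these choices and the helper names are assumptions for illustration only.

```python
from itertools import permutations

def compose(p, q):
    """Apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

G = list(permutations(range(4)))       # S_4
H = [p for p in G if p[3] == 3]        # stabilizer of the point 3
K = [p for p in G if p[0] == 0]        # stabilizer of the point 0

def double_coset(a):
    return {compose(compose(h, a), k) for h in H for k in K}

def conjugate_subgroup(a):
    """H^a = a^{-1} H a."""
    a_inv = inverse(a)
    return {compose(compose(a_inv, h), a) for h in H}

# Check |HaK| * |H^a ∩ K| = |H| * |K| for every a in G.
checks = [len(double_coset(a)) * len(conjugate_subgroup(a) & set(K))
          == len(H) * len(K) for a in G]

distinct = {frozenset(double_coset(a)) for a in G}
coset_sizes = sorted(len(c) for c in distinct)
print(all(checks), coset_sizes)
```

The distinct double cosets partition G, so their sizes add up to 24 in this example.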
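
Theorem 6.7.2 (Mackey). Let G be a finite group, H, K subgroups of G and (6.7.11) a
double coset decomposition of G for H, K. If V is any H-module and $H_i = H^{a_i} \cap K$,
then

$$\mathrm{res}_K^G\, \mathrm{ind}_H^G(V) \cong \bigoplus_i \mathrm{ind}_{H_i}^K\, \mathrm{res}_{H_i}^{H^{a_i}}(V^{a_i}). \tag{6.7.12}$$

Hence if $\alpha$ is the character of the representation afforded by V and $\alpha^g$ is again defined by
$\alpha^g(x) = \alpha(gxg^{-1})$ for $x \in H^g$, then

$$\mathrm{res}_K^G\, \mathrm{ind}_H^G(\alpha) = \sum_i \mathrm{ind}_{H_i}^K\, \mathrm{res}_{H_i}^{H^{a_i}}(\alpha^{a_i}). \tag{6.7.13}$$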


Proof. Take a coset decomposition $G = \bigcup Ht$ of G for H; then $\mathrm{ind}_H^G V = \bigoplus V^t$, and
for fixed i, the set $\{Ht \mid t \in Ha_iK\}$ is an orbit for the action of K on the coset space
G/H. Hence we find

$$\mathrm{ind}_H^G V = \bigoplus_i W_i,$$

where $W_i = \bigoplus\{V^t \mid t \in Ha_iK\}$ is a right K-module. Fix $a = a_i$ and put $W = W_i$. For
any $y, z \in K$ we have

$$Hay = Haz \iff yz^{-1} \in H^a \cap K = D,$$

say. Hence if $K = \bigcup Dy$ is a coset decomposition of K over D, then $HaK = \bigcup Hay$,
and so $H^aK = \bigcup H^a y$. Now

$$W = \bigoplus (V \otimes ay) = \bigoplus V^{ay};$$

here $V \otimes a = V^a$ is an $H^a$-module, hence also a D-module and we have

$$W = ((V^a)_D)^K = \mathrm{ind}_D^K (V^a)_D,$$

by the definition of induced module. Now (6.7.12) follows by summing over i and
restricting to K, and (6.7.13) follows by taking characters in (6.7.12). ∎


Let us apply this result in the special case of K = H: 


Proposition 6.7.3. Let G be a finite group, H a subgroup with double coset decomposition
$G = \bigcup Ha_iH$ and $\alpha$ a character of H. Then the induced character $\mathrm{ind}_H^G \alpha$ is
irreducible if and only if $\alpha$ is irreducible and $(\alpha, \mathrm{res}_{H_i}(\alpha^{a_i}))_{H_i} = 0$ for all $a_i \notin H$,
where $H_i = H^{a_i} \cap H$.


Proof. Write $\beta = \mathrm{ind}_H^G \alpha$; by Frobenius reciprocity (6.7.10) we have

$$(\beta, \beta)_G = (\alpha, \mathrm{res}_H^G \beta)_H, \tag{6.7.14}$$

and by Mackey's formula (6.7.13),

$$\mathrm{res}_H^G \beta = \sum_i \mathrm{ind}_{H_i}^H\, \mathrm{res}_{H_i}(\alpha^{a_i}).$$

Applying Frobenius reciprocity once again, we have

$$(\alpha, \mathrm{ind}_{H_i}^H\, \mathrm{res}_{H_i}(\alpha^{a_i}))_H = (\mathrm{res}_{H_i}(\alpha), \mathrm{res}_{H_i}(\alpha^{a_i}))_{H_i} = d_i,$$

say. Substituting into (6.7.14), we obtain

$$(\beta, \beta)_G = \sum_i d_i.$$

If the double coset H = H1H is represented by $a_1$, then $d_1 = (\alpha, \alpha)_H \ge 1$. For $\beta$ to be
irreducible, we must have $(\beta, \beta)_G = 1$, i.e. $d_1 = 1$ and $d_i = 0$ for $i > 1$, and these are
just the conditions stated. ∎


In particular, when H is normal in G, then $H_i = H$, $\mathrm{res}_{H_i} \alpha = \alpha$ and we find

Corollary 6.7.4. Let G be a finite group and H a normal subgroup. For any character $\alpha$
of H, $\mathrm{ind}_H^G \alpha$ is irreducible if and only if $\alpha$ is irreducible and different from $\alpha^a$ for all
$a \notin H$. ∎
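Corollary 6.7.4 can be illustrated with the dihedral group $D_m$ and its normal cyclic subgroup $H = \mathrm{gp}\{a\}$. For the character $\alpha_k : a \mapsto \omega^k$ the conjugate character is $a \mapsto \omega^{-k}$, so the criterion predicts that the induced character is irreducible exactly when $2k \not\equiv 0 \pmod m$. The sketch below checks this by computing $(\beta, \beta)_G$ directly; the function name and the range of test values are assumptions made for the example.

```python
import math

def induced_norm(m, k):
    """(beta, beta)_G for beta induced from a -> omega^k on the cyclic subgroup of D_m.

    beta(a^j) = omega^{kj} + omega^{-kj} = 2 cos(2 pi k j / m), and beta vanishes
    on the m elements outside H, so only the rotations contribute to the sum.
    """
    total = sum((2 * math.cos(2 * math.pi * k * j / m)) ** 2 for j in range(m))
    return total / (2 * m)

# Compare the computed norm with the prediction of the corollary.
results = {(m, k): round(induced_norm(m, k))
           for m in range(3, 9) for k in range(m)}
agrees = all((norm == 1) == ((2 * k) % m != 0) for (m, k), norm in results.items())
print(agrees)
```

When $2k \equiv 0 \pmod m$ the norm is 2, reflecting that the induced character then splits into two linear characters.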


Exercises 


1. Show that if $N \lhd G$ and $\alpha$ is a class function on N, then $\mathrm{ind}_N^G \alpha$ vanishes outside N.

2. Let $\alpha$ be a character of a subgroup H of G and define $\alpha^G$ as in (6.7.5). Show
directly that $(\alpha^G, \chi)_G$ is a non-negative integer for every irreducible character $\chi$
of G, and deduce that $\alpha^G$ is a character of G.

3. If $\rho^*$ denotes the contragredient of $\rho$ (see Exercise 3 of 6.1), show that
$(\rho^*)^G \cong (\rho^G)^*$.

4. Given a group G with subgroup H and characters $\alpha$, $\beta$ on H, G respectively, show
that $\mathrm{ind}_H^G(\alpha \cdot \mathrm{res}_H^G \beta) = (\mathrm{ind}_H^G \alpha)\beta$.

5. Given a group G and subgroups H, K, show that if $\alpha$, $\beta$ are characters of H, K
respectively, then

$$(\mathrm{ind}_H^G \alpha, \mathrm{ind}_K^G \beta)_G = \sum (\alpha^x, \beta^y)_{H^x \cap K^y},$$

where $x^{-1}y$ runs over a set of double coset representatives.

6. (A. H. Clifford) Let G be a finite group and H a normal subgroup. Show that if V
is a simple G-module and U a simple H-submodule of V, qua H-module, then Ux
is a simple H-module for each $x \in G$. Deduce that V is semisimple as H-module.
Show also that for each simple H-type $\alpha$ (i.e. isomorphism class of simple
submodules) the $\alpha$-socle has the same length.




6.8 Applications: The theorems of Burnside and Frobenius 


In this section we shall apply the earlier work to prove some results of pure group 
theory. In the first place we have the $p^\alpha q^\beta$-theorem of Burnside, which states that
any finite group whose order is divisible by only two distinct primes is soluble. 
This was proved by William Burnside in 1904 using character theory; for odd 
primes a purely group theoretic proof was found by John Thompson about 1960, 
but this is far more complicated. 

Let G be a finite group; we denote its conjugacy classes again by $\{C_\lambda\}$ and denote
the sum of the elements in $C_\lambda$ (in the group algebra kG) by $c_\lambda$ and their number by $h_\lambda$.
We recall from (6.4.10) that for any irreducible character $\chi$ of degree d,

$$h_\lambda \chi^{(\lambda)} = \eta_\lambda d, \tag{6.8.1}$$

where $\eta_\lambda$ is given by $\rho(c_\lambda) = \eta_\lambda I$, $\rho$ being the corresponding representation. Further
we saw in Section 6.4 that both $\chi^{(\lambda)}$ and $\eta_\lambda$ are algebraic integers.


Lemma 6.8.1. Let G be a finite group, $\rho$ a complex irreducible representation of G of
degree d with character $\chi$ and let $C_\lambda$ be a conjugacy class of G with $h_\lambda$ elements. If $h_\lambda$
is prime to d, then either $\chi$ vanishes on $C_\lambda$ or $\rho(x) = cI$ for all $x \in C_\lambda$, for some $c \in \mathbb{C}$.


Proof. We have $\chi^{(\lambda)} = \omega_1 + \dots + \omega_d$, where each $\omega_i$ is a root of 1; hence $|\chi^{(\lambda)}| \le d$,
with strict inequality unless all the $\omega_i$ are equal. In the latter case the elements of $C_\lambda$
are represented by scalars; otherwise we have $|\chi^{(\lambda)}/d| < 1$, and the same holds for all
conjugates of $\chi^{(\lambda)}$ over $\mathbb{Q}$. Since d is prime to $h_\lambda$, there exist integers a, b such that
$ah_\lambda + bd = 1$, and so

$$\chi^{(\lambda)} = (ah_\lambda + bd)\chi^{(\lambda)} = ad\eta_\lambda + bd\chi^{(\lambda)} \quad \text{by (6.8.1)}$$

$$= d(a\eta_\lambda + b\chi^{(\lambda)}).$$

Now both $\eta_\lambda$ and $\chi^{(\lambda)}$ are algebraic integers, and so is $\chi^{(\lambda)}/d$, and the same holds
for its conjugates. Therefore the norm $N(\chi^{(\lambda)}/d)$ is an integer; as we have seen, it is
less than 1 in absolute value, so $N(\chi^{(\lambda)}/d) = 0$, and it follows that $\chi^{(\lambda)} = 0$, as
claimed. ∎


This lemma has the following important consequence. We recall that for any ele- 
ment x of a finite group G the number of conjugates of x in G is the index $(G : C_x)$,
where $C_x$ is the centralizer of x in G (see BA, Section 2.1). In what follows we under-
stand by a simple group a simple non-abelian group. 


Theorem 6.8.2 (Burnside). If a finite group G has a conjugacy class $C_\lambda$ consisting of
$p^m$ elements, where $m \ge 1$ and p is prime, then G cannot be simple.


Proof. Let $d_1, \dots, d_r$ be the degrees of the irreducible representations of G; we have
seen in (6.3.10) that

$$\sum_{j=1}^r d_j^2 = |G|, \tag{6.8.2}$$

and here the right-hand side is divisible by p, because $p^m = |C_\lambda|$ is the index of a
centralizer. On the left of (6.8.2) we have $d_1 = 1$, so two cases can arise. Either
there are more linear representations of G, in which case by Proposition 6.4.2
their number is $(G : G')$, where $G'$ is the derived group, hence $G' \ne G$ and so G
is not simple; or all the non-trivial representations have degree greater than 1;
then by (6.8.2) there must be a representation $\rho_j$ of degree $d_j$ prime to p. By
Lemma 6.8.1, either $\chi_j^{(\lambda)} = 0$ or $\rho_j(x)$ for $x \in C_\lambda$ is a scalar. In the latter case either
$\rho_j$ is not faithful or x lies in the centre of G; each time it follows that G is not simple.
This only leaves the alternative $\chi_j^{(\lambda)} = 0$. Thus for any character $\chi_j$ we have either
$\chi_j^{(\lambda)} = 0$ or $d_j \equiv 0 \pmod p$, except for j = 1, when $\chi_1^{(\lambda)} = d_1 = 1$. By
orthogonality we have $\sum_j d_j \chi_j^{(\lambda)} = 0$, hence $1 \equiv 0 \pmod p$, which is a
contradiction. ∎
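Theorem 6.8.2 can be seen at work on small groups by listing conjugacy class sizes. The sketch below compares $S_4$, which is not simple, with the alternating group of degree 5, which is simple; the helper names are assumptions introduced for this illustration.

```python
from itertools import permutations

def compose(p, q):
    """Apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def sign(p):
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def class_sizes(group):
    """Sorted list of the sizes of the conjugacy classes of `group`."""
    remaining = set(group)
    sizes = []
    while remaining:
        x = next(iter(remaining))
        cls = {compose(compose(inverse(g), x), g) for g in group}
        sizes.append(len(cls))
        remaining -= cls
    return sorted(sizes)

def is_prime_power(n):
    """True if n = p^m for a prime p and some m >= 1."""
    if n < 2:
        return False
    p = 2
    while n % p:          # find the smallest prime factor of n
        p += 1
    while n % p == 0:
        n //= p
    return n == 1

S4 = list(permutations(range(4)))
A5 = [p for p in permutations(range(5)) if sign(p) == 1]

s4_sizes = class_sizes(S4)
a5_sizes = class_sizes(A5)
print(s4_sizes, [n for n in s4_sizes if is_prime_power(n)])
print(a5_sizes, [n for n in a5_sizes if is_prime_power(n)])
```

The simple group has no class of prime power size, as the theorem requires, while $S_4$ has classes of sizes 3 and 8.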


Now it is an easy matter to deduce the solubility criterion: 


Theorem 6.8.3 (Burnside). Every group of order $p^\alpha q^\beta$, where $\alpha, \beta \ge 1$ and p, q are
primes, is soluble.


Proof. Take a simple factor H of order $p^a q^b$; by Theorem 6.8.2 no conjugacy class
can have prime power order, hence $a, b \ge 1$ and moreover, any conjugacy class
other than {1} has order divisible by p and q. The class equation for H reads

$$p^a q^b = 1 + \sum_i p^{a_i} q^{b_i},$$

where a, b and all the $a_i$, $b_i$ are positive. This is a contradiction and the result
follows. ∎


Secondly we give a result proved by Georg Frobenius in 1901 on the existence of 
complements in a group. If G is any group and H, K are subgroups such that 
$H \cap K = 1$, $HK = G$, then each of H, K is called a complement of the other; this
term is used particularly when one of H, K is normal in G. The main result is 
preceded by a lemma which is needed for the proof. 


Lemma 6.8.4. Let G be a finite group and H a non-trivial subgroup such that
$H^x \cap H = 1$ for all $x \notin H$. If $\alpha$ is a class function on H such that $\alpha(1) = 0$ and
$\alpha^G = \mathrm{ind}_H^G \alpha$, then $(\alpha^G)_H = \alpha$ and for any class function $\beta$ on H, $(\alpha^G, \beta^G)_G = (\alpha, \beta)_H$.

Proof. We have $\alpha^G(1) = (G : H)\alpha(1) = 0$, by hypothesis. Let us fix $h \in H$, where
$h \ne 1$. Then by definition, we have

$$\alpha^G(h) = |H|^{-1} \sum_{x \in G} \alpha(h^x), \tag{6.8.3}$$

where $\alpha$ is taken to be 0 outside H, as in (6.7.5).
Consider a term in this sum; if $\alpha(h^x) \ne 0$, then $h^x \in H^x \cap H$, so $x \in H$ and
$\alpha(h^x) = \alpha(h)$. Thus all the non-zero terms in the sum on the right of (6.8.3) are
$\alpha(h)$, and there is a term for each $x \in H$. Hence

$$\alpha^G(h) = |H|^{-1} \sum_{x \in H} \alpha(h^x) = \alpha(h),$$

and this proves the first assertion. For any other class function $\beta$ on H, Frobenius
reciprocity gives $(\alpha^G, \beta^G)_G = ((\alpha^G)_H, \beta)_H = (\alpha, \beta)_H$, by what has been proved. ∎


The actual result of Frobenius can be stated in two equivalent forms, in terms of 
permutation groups or abstractly. 


Theorem 6.8.5 (Frobenius). Let G be a finite transitive permutation group such that
no element $\ne 1$ has more than one fixed point. Then the elements without fixed
point, together with 1, form a normal subgroup N of G.

The normal subgroup N is called the Frobenius kernel. We remark that if H is the 
stabilizer of a point, then 


NH =G 


by transitivity (see the proof below), while $N \cap H = 1$ holds by hypothesis. Thus N
is a complement of H, and this theorem may also be stated in the following form: 


Theorem 6.8.6 (Frobenius). Let G be a finite group with a subgroup H such that

$$H^x \cap H = 1 \quad \text{for all } x \notin H, \tag{6.8.4}$$


i.e. H meets each of its conjugates in 1 and is its own normalizer. Then H has a normal 
complement in G. 

A group G with these properties is called a Frobenius group and the subgroup H 
with the property (6.8.4) is called a Frobenius subgroup of G. So Theorem 6.8.6 
states that every Frobenius subgroup has a normal complement. 


Proof. Let us first show the equivalence of Theorems 6.8.5 and 6.8.6. Under the
hypothesis of Theorem 6.8.5 let H be the stabilizer of a point. If $H \lhd G$, then all points
have the same stabilizer. In that case H = G or H = 1 and the conclusion follows
with N = 1 or G respectively, so we may exclude this case. Further, the conditions
of Theorem 6.8.5 mean that any $x \in G \setminus H$ moves the point fixed by H to another
point, not fixed by any element in $H^\times$ (where $H^\times = H \setminus \{1\}$), so that $H^x \cap H = 1$,
while the elements moving all points make up precisely the set $G \setminus \bigcup H^x$. Conversely,
given the hypotheses of Theorem 6.8.6, we see on taking the coset representation
on G/H that no element $\ne 1$ fixes more than one point and the elements moving all
points comprise the complement of the union of all the conjugates of H in G. If they,
together with 1, form a subgroup (and this is what we have to prove), then it must be
a complement of H in G.

Let us denote this set, namely $(G \setminus \bigcup H^x) \cup \{1\}$, by N. It is clear that N is a normal
set (i.e. a union of conjugacy classes), and writing |G| = n, |H| = m, n = mr, we
have $|N| = n - r(m - 1) = r$, so if N is a subgroup, it will be a complement of H
because $r = (G : H)$ and clearly $H \cap N = 1$. So all we need to show is that N is a
subgroup. We shall obtain it as the kernel of a certain representation.




Let $\alpha$ be any class function on H and put $\tilde\alpha = \alpha - \alpha(1) \cdot 1_H$, where $1_H$ is the trivial
character on H. Then $\tilde\alpha(1) = 0$, and $\tilde\alpha^G$ is a class function on G. Put

$$\alpha^* = \tilde\alpha^G + \alpha(1) \cdot 1_G. \tag{6.8.5}$$

Then

$$(\alpha^*, \alpha^*)_G = (\tilde\alpha^G, \tilde\alpha^G)_G + 2\alpha(1)(\tilde\alpha^G, 1_G)_G + \alpha(1)^2$$

$$= (\tilde\alpha, \tilde\alpha)_H + 2\alpha(1)(\tilde\alpha, 1_H)_H + \alpha(1)^2,$$

by Lemma 6.8.4. Hence

$$(\alpha^*, \alpha^*)_G = (\tilde\alpha + \alpha(1)1_H, \tilde\alpha + \alpha(1)1_H)_H = (\alpha, \alpha)_H.$$

If $\alpha$ is an irreducible character, then $\alpha^*$ is either a character or a difference of
characters of G, i.e. a virtual character, and moreover $(\alpha^*, \alpha^*)_G = (\alpha, \alpha)_H = 1$, so
either $\alpha^*$ or $-\alpha^*$ is an irreducible character of G; in fact it must be $\alpha^*$, because
$\alpha^*(1) = \tilde\alpha^G(1) + \alpha(1) > 0$. Further we have by Lemma 6.8.4,

$$\mathrm{res}_H \alpha^* = \mathrm{res}_H \tilde\alpha^G + \alpha(1) \cdot 1_H = \tilde\alpha + \alpha(1)1_H = \alpha.$$

Now consider $K = \bigcap \ker \alpha^*$, where the intersection is taken over all irreducible
characters $\alpha$ on H and ker indicates the kernel of the corresponding representation.
Then $K \lhd G$ and for any irreducible character $\alpha$ on H and $x \in H \cap K$ we have

$$\alpha(x) = \alpha^*(x) = \alpha^*(1) = \alpha(1);$$

hence $H \cap K = 1$ by Corollary 6.4.9. Since K is normal in G, we also have
$H^x \cap K = 1$ for all $x \in G$, and so $K \subseteq N$. Here equality holds, for if $x \in N^\times$, then
$x \notin H^y$ for all $y \in G$, hence $\tilde\alpha^G(x) = 0$ for all irreducible characters $\alpha$ of H. So in
that case $\alpha^*(x) = \alpha^*(1)$ by (6.8.5), and so $x \in K$ by Proposition 6.4.8. This shows
that N = K and it proves N to be a subgroup. ∎
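The statement of Theorem 6.8.5 can be verified by direct computation in a small case. The sketch below uses the group of affine maps $x \mapsto ax + b$ over the field of 5 elements acting on its 5 points; this choice of group, the encoding of elements as pairs (a, b), and the helper names are assumptions for illustration.

```python
p = 5
G = [(a, b) for a in range(1, p) for b in range(p)]   # the map x -> a*x + b over F_5

def compose(f, g):
    """Apply f first, then g: x -> c*(a*x + b) + d."""
    (a, b), (c, d) = f, g
    return ((c * a) % p, (c * b + d) % p)

def inverse(f):
    a, b = f
    a_inv = pow(a, p - 2, p)            # inverse of a modulo p by Fermat's theorem
    return (a_inv, (-a_inv * b) % p)

def fixed_points(f):
    a, b = f
    return [x for x in range(p) if (a * x + b) % p == x]

identity = (1, 0)
H = [f for f in G if f[1] == 0]         # stabilizer of the point 0
N = {f for f in G if not fixed_points(f)} | {identity}

at_most_one = all(len(fixed_points(f)) <= 1 for f in G if f != identity)
closed = all(compose(f, g) in N for f in N for g in N)
normal = all(compose(compose(inverse(g), n), g) in N for g in G for n in N)
complement = ({compose(n, h) for n in N for h in H} == set(G)
              and N & set(H) == {identity})
print(at_most_one, closed, normal, complement, len(N))
```

Here N turns out to be the set of translations, and its order equals the index $(G : H) = 5$, matching the count $|N| = r$ in the proof.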


So far no proof of this result without representation theory is known, although 
special cases (e.g. for soluble groups) have been proved in this way. Burnside has 
shown that the Sylow p-subgroups of the Frobenius subgroup are either cyclic, or 
in case p = 2, generalized quaternion groups, and Thompson has shown that the 
Frobenius kernel is nilpotent (see Huppert (1967)). 


Exercises 


1. Let the finite group G be a split extension of N by a subgroup H. Show that H is a
Frobenius subgroup iff H acts freely on $N^\times = N \setminus \{1\}$, i.e. for any $x \in N^\times$,
$h \in H^\times$, we have $x^h \ne x$.

2. Verify that in a dihedral group of order 2m, where m is odd, the cyclic subgroup 
of order m is a Frobenius kernel. 

3. Let G be a finite group with Frobenius subgroup H. Show that under the action of
H on the coset space G/H there is one orbit of length 1, while all the others have
length |H|. Deduce that |H| divides $(G : H) - 1$ and that the Frobenius kernel N
has order prime to its index.

4. Show that if a conjugacy class C of a non-trivial group G has $p^m$ elements, where
$m \ge 1$ and p is prime, then the set $C^{-1}C$ generates a proper normal subgroup
of G.

5. Give a direct proof of Theorem 6.8.5 in the case where the set acted on by G has
$p^m$ elements, where $m \ge 1$ and p is prime.

6. Show that every normal Hall subgroup (i.e. of order prime to its index, see
Section 3.2) is characteristic in G (i.e. admitting all automorphisms of G).


Further exercises on Chapter 6 


1. Let G be a soluble group of order divisible by a prime p. If $N = G_{i-1}/G_i$ is a
chief factor of order $p^r$ of G (see Section 3.1), show that the action of G on N
induced by inner automorphisms is a representation of degree r over the field
$F_p$ of p elements.

2. Show that an irreducible representation of a p-group over a field of characteristic
p is necessarily trivial. Find a (reducible) representation of degree 2 of $C_p$ over $F_p$
which is faithful.

3. Show that any representation of a finite group G over $\mathbb{Q}$ is equivalent to a
representation over $\mathbb{Z}$. (Hint. Take a basis $(u_i)$ of the representation module, let N be
a common denominator for all the representation coefficients and consider the
subgroup A of $\sum \mathbb{Z}u_i$ generated by all the expressions $Nu_ix$ $(x \in G)$; show that A
is again a G-module and find a $\mathbb{Z}$-basis for it.)

4. Show that in an irreducible representation of a finite group G over an
algebraically closed field k, each element of the centre of G is represented by a scalar
matrix. Show that this holds even if k is not algebraically closed but contains
a primitive |G|-th root of 1.

5. Let G be a group with a faithful irreducible representation over a field of
characteristic 0. Show that the centre of G is cyclic. (Hint. Use Exercise 4 to
prove first the special case where the ground field contains a primitive |G|-th
root of 1.)

6. Show that for a permutation representation of a group G with character $\chi$ the
number of orbits is $(\chi, 1_G)$, where $1_G$ is the trivial character.

7. Show that if a character $\chi$ of a faithful representation assumes r distinct values
on G, then the Vandermonde determinant formed from these values is non-zero.
Deduce that every irreducible character is a constituent of at least one of
$1_G, \chi, \chi^2, \dots, \chi^{r-1}$.

8. Show that a doubly transitive permutation representation (i.e. transitive on the
pairs) is a sum of the trivial representation and one other irreducible representation.
(Hint. The stabilizer must be transitive; now use Theorem 6.4.11 and the
orthogonality relations.)

9. For each representation $\rho(x) = (\rho_{ij}(x))$ of degree d define $d^2$ elements of the
group algebra as $\rho_{ij} = \sum_x \rho_{ij}(x)x$. Show that the orthogonality relations take
the form: for irreducible representations $\rho$, $\sigma$ that are inequivalent, $\rho_{ij}\sigma_{kl} = 0$,
while $\rho_{ij}\rho_{pq} = (|G|/d)\delta_{jp}\rho_{iq}$.


10. Let G be a finite group. Show that if every irreducible representation of G over $\mathbb{C}$
is 1-dimensional, then G is abelian. (Hint. Diagonalize a matrix of the regular
representation.)

11. (J. A. Green) Show that if V is a simple G-module over a finite field F,
then $E = \mathrm{End}_G(V)$ is a finite field. Further, if $|E| = q$, then $V \oplus V \oplus \dots \oplus V$
(n factors) has exactly $(q^n - 1)/(q - 1)$ simple submodules.

12. Show that the affine group $A = \mathrm{Aff}_1(F_q)$ of all transformations $x \mapsto ax + b$
$(a, b \in F_q, a \ne 0)$ is a Frobenius group with the translation group as kernel and
the stabilizer of 0 as complement.

13. Let G be a finite group and $X = (\chi_i^{(\lambda)})$ its character table. Show that any
automorphism $\alpha$ of G induces a permutation of the rows of X, and a permutation of
the columns, where the effect of $\alpha$ on X is defined by $\chi_i^\alpha(x) = \chi_i(x^\alpha)$. If $P(\alpha)$,
$Q(\alpha)$ are the permutation matrices describing the effects of $\alpha$ on the rows and
columns of X respectively, then $X^\alpha = P(\alpha)X = XQ(\alpha)$. Deduce that $P(\alpha)$,
$Q(\alpha)$ are conjugate and hence have the same trace. Hence show that for any
group A of automorphisms acting on G the number of orbits of the set of
irreducible characters is the same as the number of orbits of the conjugacy classes
of G under the action of A.

14. Let G be a Frobenius group with kernel K and complement H. Show that for any
non-trivial irreducible character $\alpha$ of K, $\mathrm{ind}_K^G \alpha$ is an irreducible character of G.
(Hint. Consider the group of automorphisms of K induced by H; show that
$c \in H^\times$ fixes no conjugacy class $\ne \{1\}$ of K, and use Exercise 13 to show that
for any non-trivial character $\chi$ of K, $\chi^c \ne \chi$; now apply reciprocity to show
that $\mathrm{ind}_K^G \alpha$ is irreducible.)

15. Let G, H, K be as in Exercise 14. Show that any irreducible character $\chi$ of G is
either trivial on K or induced up from an irreducible character $\psi$ of K.
Deduce that for such $\chi$, $\psi$ and any $c \in H$, $\mathrm{res}_H \chi(c) = \delta_{c1}\psi(1) \cdot |H|$.

16. Let H be a subgroup of $S_n$ and for $h \in H$ denote by $F_h$ the set of numbers left
fixed by h. Show that $\chi(h) = |F_h| - 1$ is a character of H. (Hint. Let $S_n$ act by
permutations on a basis $u_1, \dots, u_n$ of an n-dimensional vector space V and
take a decomposition of V including the subspace spanned by $\sum u_i$.)


Noetherian rings and 
polynomial identities 


The Artinian condition on rings leads to a very satisfactory theory, at least in the 
semisimple case, yet it excludes such familiar examples as the ring of integers. 
This ring is included in the wider class of Noetherian rings, which has been much 
studied in recent years. We shall present some of the highlights, such as localization 
(Section 7.1), non-commutative principal ideal domains (Section 7.2) and Goldie’s 
theorem (Section 7.4), and illustrate the theory by examples from skew polynomial 
rings and power series rings in Section 7.3. 

Another condition which helps to make rings amenable is the existence of a poly- 
nomial identity. The topics treated include generic matrix rings and central polyno- 
mials (Section 7.7) and the theorems of Regev (Section 7.6) and Amitsur (Section 
7.8), as well as some generalities in Section 7.5, while Kaplansky’s basic theorem 
on PI-rings is reserved for Chapter 8.


7.1 Rings of fractions 


In BA, Section 10.3 we constructed a ring of fractions $R_S$ for a commutative ring R
with respect to the multiplicative subset S. This construction can be carried out quite
generally, as we shall see in Theorem 7.1.1 below, which should be compared with
Theorem 10.3.1 of BA. It is not even necessary to assume R commutative. But for
the result to be of use, we must have a means of comparing expressions in $R_S$ and
this will require further hypotheses to be imposed. This will be done in Theorem
7.1.3 below. For a general treatment it is necessary to invert matrices rather than
just elements, but this will not be needed here (see Cohn (1985)).

Given any ring R and a subset X of R, we say that a homomorphism $f : R \to R'$
is X-inverting if it maps each element of X to an invertible element of $R'$. For fixed
R and X, the pairs $(R', f)$, where f is X-inverting, can be regarded as objects in a
category whose morphisms are commutative triangles, and an initial object in this 
category is called a universal X-inverting homomorphism. An element c in a ring R 
is called left regular if xc = 0 implies x = 0, i.e. c is not a right zero-divisor; right 
regular elements are defined similarly, and a regular element is one which is left 
and right regular. 




Theorem 7.1.1. Let R be any ring and S a subset of R. Then there exists a ring $R_S$ and a
homomorphism $\lambda : R \to R_S$ which is universal S-inverting.


Proof. In detail the assertion means that $\lambda$ is S-inverting and every S-inverting
homomorphism can be factored uniquely by $\lambda$. To construct $R_S$ we take for each
$a \in R$ a symbol $p_a$ and for each $s \in S$ an additional symbol $q_s$ and form the ring
$R_S$ on all these symbols as generators, with the defining relations

$$p_1 = 1, \quad p_a + p_b = p_{a+b}, \quad p_a p_b = p_{ab}, \quad p_s q_s = q_s p_s = 1, \quad \text{for all } a, b \in R,\ s \in S. \tag{7.1.1}$$

The first three sets of equations ensure that the mapping $\lambda : a \mapsto p_a$ is a
homomorphism of R into $R_S$, while the fourth shows that $\lambda$ is S-inverting. Now given
any S-inverting homomorphism $f : R \to R'$, we can define a homomorphism g of
the free ring F on the p's and q's into $R'$ by the rules $p_a \mapsto af$, $q_s \mapsto (sf)^{-1}$. It is
clear that g preserves the relations (7.1.1); hence it induces a homomorphism
$g_1 : R_S \to R'$, and it is easily checked that $\lambda g_1 = f$. Moreover, $g_1$ is uniquely
determined by this equation, since its value on $p_a$ is given, while its value on $q_s$ is
determined by (7.1.1): if $q_s g_1 = c$, then $sf \cdot c = (p_s q_s)g_1 = 1 = (q_s p_s)g_1 = c \cdot sf$,
so $c = (sf)^{-1}$. Thus $g_1$ is determined on a generating set of $R_S$ and hence is
unique. ∎


Corollary 7.1.2. Let R, $R'$ be rings with subsets S, $S'$ respectively. Then any
homomorphism $f : R \to R'$ which maps S into $S'$ can be extended in a unique way to a
homomorphism of $R_S$ into $R'_{S'}$.


Proof. The composition $R \xrightarrow{f} R' \xrightarrow{\lambda'} R'_{S'}$ is S-inverting and so can be factored
uniquely by $\lambda : R \to R_S$ to give the required homomorphism $f_1 : R_S \to R'_{S'}$. ∎


The ring $R_S$ constructed in Theorem 7.1.1, together with the homomorphism
$\lambda : R \to R_S$, is called the universal S-inverting ring for R or also the localization of
R at the set S. To study it in more detail we need to make some simplifying
assumptions. A very simple but most fruitful idea, due to Øystein Ore [1931] (and
independently, to Emmy Noether, unpublished) is to look at the case where all the elements
of $R_S$ can be written as simple fractions $(a\lambda)(s\lambda)^{-1}$. If this is to be possible, we must
in particular be able to express $(s\lambda)^{-1}(a\lambda)$ in this form: $(s\lambda)^{-1}(a\lambda) = (a_1\lambda)(s_1\lambda)^{-1}$,
hence on multiplying up, we obtain

$$(as_1)\lambda = (sa_1)\lambda, \tag{7.1.2}$$

and this provides a clue to the condition needed. Of course, if every element is to be
expressed as a fraction with denominator in S, we must also assume S to be
multiplicative, i.e. to contain 1 and be closed under products.


Theorem 7.1.3. Let R be a ring and S a multiplicative subset of R such that


O.1 $aS \cap sR \ne \emptyset$ for all $a \in R$, $s \in S$,
O.2 for each $a \in R$, $s \in S$, if $sa = 0$, then $at = 0$ for some $t \in S$.


Then the elements of the universal S-inverting ring $R_S$ can be constructed as fractions
a/s, where

$$a/s = a'/s' \iff au = a'u' \text{ and } su = s'u' \in S \text{ for some } u, u' \in R. \tag{7.1.3}$$

Moreover, the kernel of the natural homomorphism $\lambda : R \to R_S$ is

$$\ker \lambda = \{a \in R \mid at = 0 \text{ for some } t \in S\}. \tag{7.1.4}$$


Condition O.1 is called the right Ore condition. A multiplicative subset S of R
satisfying O.1 is called a right Ore set; if S also satisfies O.2, it is called (right) reversible or
also a right denominator set. Such a set allows the construction of right fractions
$a/s = (a\lambda)(s\lambda)^{-1}$, by Theorem 7.1.3. They must be carefully distinguished from
left fractions $(s\lambda)^{-1}(a\lambda)$. By symmetry we have the notion of a (reversible) left
Ore set, which allows us to construct all the elements of $R_S$ as left fractions, and
the set S in Theorem 7.1.3 may well be a right but not left Ore set.


Proof. The proof of Theorem 7.1.3 is similar to the commutative case (BA, Theorem
10.3.1), though more care is needed, owing to the lack of commutativity. Guided by
(7.1.3), we define a relation on $R \times S$ by writing

$$(a, s) \sim (a', s') \iff \text{there exist } u, u' \in R \text{ such that } au = a'u',\ su = s'u' \in S.$$

We claim that this is an equivalence. Clearly it is reflexive and symmetric. To prove
transitivity, let $(a, s) \sim (a', s')$, $(a', s') \sim (a'', s'')$; say $au = a'u'$, $su = s'u' \in S$,
$a'v = a''v'$, $s'v = s''v' \in S$. By O.1 there exist $z \in S$, $z' \in R$ such that $s'u'z = s'vz'$,
hence $s'u'z \in S$ (by multiplicativity) and moreover, $s'(u'z - vz') = 0$, hence by
O.2 there exists $t \in S$ such that $u'zt = vz't$. Now we have $auzt = a'u'zt =
a'vz't = a''v'z't$, $suzt = s'u'zt = s'vz't = s''v'z't$, and this lies in S because $s'u'z \in S$
and $t \in S$. Thus $(a, s) \sim (a'', s'')$.

We thus have an equivalence on R x S; let us write a/s for the equivalence class 
containing (a, s) and call a the numerator and s the denominator of this expression. 
We note that (7.1.3) now holds by definition, and it may be interpreted as saying that 
two fractions are equal iff when they are brought to a common denominator, their 
numerators agree. Of course it follows from O.1 that any two expressions can be 
brought to a common denominator. For this reason we can define the addition of 
fractions by the rule 


a/s+b/s=(a+b)/s. (7.1.5) 


Here it is necessary to check that the expression on the right depends only on a/s, b/s
and not on a, b, s, a task which may be left to the reader. To define the product of a/s
and b/t we determine $b_1 \in R$ and $s_1 \in S$ such that $bs_1 = sb_1$ and then put

$$(a/s)(b/t) = ab_1/ts_1.$$

Again the proof that this is well-defined is left to the reader. Now it is easy to check
that with these operations the classes a/s form a ring T say, and the mapping $a \mapsto a/1$
is an S-inverting homomorphism from R to T. Moreover, if $f : R \to R'$ is any
S-inverting homomorphism, then the mapping $f_1 : R \times S \to R'$ given by $(a, s) \mapsto
(af)(sf)^{-1}$ is constant on each equivalence class and so can be factored via T
to provide a homomorphism $f' : T \to R'$ such that $f = \lambda f'$. Here $f'$ is unique,
because it is determined on a/1 and 1/s, so by uniqueness T is indeed the universal
S-inverting ring. Finally, $\ker \lambda$ consists of all a such that a/1 = 0/1, i.e. by (7.1.3), all a such that
$at = 0$ for some $t \in S$. ∎


An important case is that where S lies in the centre of R. Then O.1-O.2 are automatic
and we have


Corollary 7.1.4. Let R be a ring and S any multiplicative subset of the centre of R. Then
S is a reversible Ore set and the universal S-inverting ring R_S consists of all fractions a/s
(a ∈ R, s ∈ S), where


a/s = a′/s′ ⇔ (as′ − sa′)t = 0 for some t ∈ S. ∎
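The equality criterion of Corollary 7.1.4 can be tried out on a small commutative example. The following minimal Python sketch takes R = Z/6 and S = {1, 2, 4} (the powers of 2 modulo 6); this choice of ring, and the names `equal`, `classes` and `kernel`, are our own assumptions made purely for illustration. It collects one representative of each fraction a/s and lists the kernel of the mapping a ↦ a/1.

```python
n = 6
S = [1, 2, 4]  # the powers of 2 modulo 6; closed under multiplication

def equal(a, s, b, t):
    """a/s = b/t iff (a*t - s*b)*r = 0 in Z/n for some r in S."""
    return any(((a * t - s * b) * r) % n == 0 for r in S)

# Greedily pick one representative (a, s) from each equivalence class.
classes = []
for a in range(n):
    for s in S:
        if not any(equal(a, s, b, t) for (b, t) in classes):
            classes.append((a, s))

# Kernel of a -> a/1: all a with a*r = 0 for some r in S.
kernel = [a for a in range(n) if equal(a, 1, 0, 1)]
```

In this example the kernel is {0, 3} and there are three distinct fractions, in agreement with the fact that inverting 2 in Z/6 leaves a copy of Z/3.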


The conditions of Theorem 7.1.3 simplify slightly when S consists entirely of regular
elements. Then O.2 is superfluous and ker λ = 0. We state this as


Corollary 7.1.5. Let R be a ring and S a right Ore subset of regular elements. Then the
natural homomorphism λ : R → R_S is injective. ∎


The subset T of all regular elements in R is always multiplicative and satisfies O.2.
When it satisfies O.1, we can form R_T; this is called the total (or classical) quotient
ring. Generally one understands by a quotient ring a ring in which every regular
element is a unit.

Finally we note the special case of integral domains. 


Corollary 7.1.6. Let R be an integral domain such that
aR ∩ bR ≠ 0 for all a, b ∈ R^×. (7.1.6)


Then R^× is a regular right Ore set, K = R_{R^×} is a skew field and the natural mapping
λ : R → K is an embedding. Conversely, if R is an integral domain with an embedding
in a skew field whose elements all have the form ab⁻¹ (a, b ∈ R, b ≠ 0), then (7.1.6)
holds. ∎


The skew field K occurring here is called the field of fractions of R. The special case
(7.1.6) of O.1 was used by Ore [1931] in his proof of Corollary 7.1.6; since then there
have been many papers generalizing Ore’s construction to the case of Theorem 7.1.3
or a special case. An integral domain satisfying (7.1.6) is called a right Ore domain;
left Ore domains are defined similarly and an Ore domain is a domain which is left
and right Ore.

The following property of fractions is often useful. 


Proposition 7.1.7. Let R be a ring with a reversible right Ore set S and universal
S-inverting ring R_S. Then any finite set in R_S can be brought to a common denominator.


Proof. We shall use induction on the number of elements. Let a_i/s_i ∈ R_S
(i = 1, ..., n) be given. For n = 1 there is nothing to prove, so by induction we may
assume the fractions to be in the form a₁/s₁, a₂/s, ..., a_n/s. By O.1 there exist t ∈ S,
c ∈ R such that sc = s₁t = u ∈ S, hence the fractions can be written a₁t/u,
a₂c/u, ..., a_nc/u. ∎


In any integral domain R (commutative or not) the additive subgroup generated 
by 1 is a commutative subring, necessarily a homomorphic image of Z, hence it is 
either Z or Z/p for some prime p. Accordingly we say that R has characteristic 0 
or p; this generalizes the customary usage for fields. If K is any commutative ring, 
a K-algebra A is said to be faithful if the natural homomorphism K — A is injective. 
Thus we can say that every integral domain can be regarded as a faithful K-algebra,
where K = Z or F_p.

Suppose now that R is an integral domain, hence a faithful K-algebra (K = Z or
F_p), but not right Ore. Then R contains a, b ≠ 0 such that aR ∩ bR = 0. We claim
that the K-subalgebra of R generated by a and b is free on a, b. For let f(x, y) be
a non-zero polynomial in the non-commuting variables x, y with coefficients in K,
such that f(a, b) = 0, and choose f to have the least possible degree. We can write


f(x, y) = α + xf₁(x, y) + yf₂(x, y), where α ∈ K,


and f₁, f₂ are not both zero, for otherwise α = 0 and f would be the zero polynomial.
Suppose that f₁ ≠ 0; then f₁(a, b) ≠ 0, by the minimality of deg f, but f(a, b) = 0,
hence f(a, b)b = 0 and so


af₁(a, b)b = b(−α − f₂(a, b)b),


and this is non-zero because the left-hand side is non-zero. This contradicts the fact
that aR ∩ bR = 0, and it proves


Proposition 7.1.8. Any integral domain R is a faithful K-algebra, where K is Z or F_p,
according as R is of characteristic 0 or p. Further, R is either a left and right Ore domain,
or it contains a free K-algebra on two free generators.


We observe that the last two possibilities are not mutually exclusive; an Ore 
domain may well contain a free algebra. We also recall from BA, Further Exercise 9 
of Chapter 6, that an algebra containing a free algebra of rank 2 contains a free 
algebra of countable rank. 

An interesting observation made by Alfred Goldie [1958] (actually the simplest 
case of Proposition 7.4.8 below) is that the Ore condition is a consequence of the 
Noetherian condition. 


Proposition 7.1.9. An integral domain either is a right Ore domain or it contains free 
right ideals of infinite rank. In particular, any right Noetherian domain is right Ore. 


Proof. Let R be an integral domain and suppose that R is not right Ore. Then there
exist a, b ∈ R^× such that aR ∩ bR = 0; now the conclusion will follow if we show
that the elements b, ab, a²b, ... are right linearly independent over R. Suppose
that Σ a^i bc_i = 0 and let c_r be the first non-zero coefficient. We can cancel a^r and
obtain the relation bc_r + abc_{r+1} + ... + a^{n−r}bc_n = 0, i.e.


a(bc_{r+1} + ... + a^{n−r−1}bc_n) = −bc_r ≠ 0,


and this contradicts the assumption on a, b. ∎


A ring R is said to be prime if R ≠ 0 and xRy = 0 implies x = 0 or y = 0. Such rings
will be discussed in Section 8.5; for the moment let us apply the results found to
prime rings.


Proposition 7.1.10. Let R be a prime ring and S a right Ore set consisting of regular
elements of R. Then R is embedded in R_S and R_S is again prime.


Proof. The mapping R → R_S is an embedding, by Corollary 7.1.5. Suppose that
uRv = 0, where u = as⁻¹, v = bt⁻¹ (a, b ∈ R, s, t ∈ S). Then for any x ∈ R, axb =
as⁻¹·sx·bt⁻¹·t = 0, hence a = 0 or b = 0 and accordingly u = 0 or v = 0. ∎


Corollary 7.1.11. Let R be a prime ring with centre C. Then C^× consists of regular
elements in R; in particular, C is an integral domain, and if its field of fractions is denoted
by K, then R is embedded in R_{C^×} = R ⊗_C K, and the latter ring is again prime, with
centre K.


Proof. We first show that C^× consists of regular elements. If c ∈ C^× and ca = 0
(a ∈ R), then for all x ∈ R, cxa = xca = 0, hence a = 0. Since ac = ca for a ∈ R, this
shows c to be regular. Since K is universal C^×-inverting, it follows that R ⊗ K is
universal C^×-inverting and by Proposition 7.1.10, the mapping R → R_{C^×} = R ⊗ K
is an embedding. Clearly K is contained in the centre of R ⊗ K; conversely, if ac⁻¹
is in the centre, then for any x ∈ R, ax = ac⁻¹·cx = cx·ac⁻¹ = xa, because c ∈ C.
Hence a ∈ C and so ac⁻¹ ∈ K. ∎


Exercises 


1. Show that every right Artinian ring is its own quotient ring.

2. Show that an integral domain in which every finitely generated right ideal is
principal is a right Ore domain (such a ring is called a right Bezout domain).

3. Show that in a right Noetherian ring every right Ore set is reversible.

4. Show that for any reversible right Ore set S in a ring R, the ring R_S is flat as left
R-module.

5. Show that for any ring R and subset S the natural mapping R → R_S is an
epimorphism in the category of rings.

6. Show that the characteristic can be defined for simple rings as for integral
domains, and that a simple ring is a Q-algebra or an F_p-algebra according as
the characteristic is 0 or p.

7. Let R be a simple ring and S a reversible right Ore subset. Show that the centre of
R_S coincides with the centre of R.

8. Let R be a ring, X a subset consisting of regular elements and S the submonoid of
R^× generated by X. Show that S is right Ore provided that the Ore condition
holds in the form xR ∩ rS ≠ ∅ for all x ∈ X, r ∈ R.

9. Let F be the free group on x and y. Show that the group ring ZF is an integral
domain, but not left or right Ore.

10. Let R be a ring with right total quotient ring. Show that any right factor of a
regular element of R is regular.

11. Let R be a right Ore domain and K its field of fractions. Show that the centre of K
consists of all ab⁻¹ (b ≠ 0) such that axb = bxa for all x ∈ R.


7.2 Principal ideal domains 


An important class of Noetherian domains is formed by the principal ideal domains,
i.e. integral domains in which every right ideal and every left ideal is principal. By
Proposition 7.1.9 and Corollary 7.1.6 every principal ideal domain (PID for short) 
can be embedded in a skew field. It follows that for a square matrix over a PID a 
left (or right) inverse is in fact two-sided, since this holds over a skew field. 

In BA, Section 2.4 we saw that every finitely generated abelian group can be 
written as a direct sum of cyclic groups of prime power orders. A corresponding 
result can be proved more generally for finitely generated modules over a (commu- 
tative) principal ideal domain. It depends on the fact that every square matrix over 
such a ring is associated to a diagonal matrix; this is known as the Smith normal 
form, which is well known for Z and more generally, any Euclidean domain. Surpris- 
ingly there is an analogue for non-commutative PIDs, with a rather strong unique- 
ness condition which is sometimes useful. To state the result, let us define an 
invariant element in a ring R as a regular element c such that cR = Rc, thus the 
left (or right) ideal generated by c is two-sided. If a, b are regular elements of R, 
then a is called a total divisor of b, in symbols a||b, if there exists an invariant element 
c such that a|c|b. We observe that an element is not generally a total divisor of itself;
in fact a||a iff a is invariant. A simple ring has no non-unit invariant elements and so
a||b in a simple ring implies that either a is a unit or b = 0. Further, in a principal
ideal domain R, every two-sided ideal is of the form cR = Rc, where c is 0 or an
invariant element.

We shall write diag(d₁, ..., d_r) for a matrix with d₁, ..., d_r along the main
diagonal and zeros elsewhere; this notation will be used even for matrices that are
not square. The exact size is usually clear from the context, or will be indicated 
explicitly. 


Theorem 7.2.1. Let R be a principal ideal domain and A ∈ ᵐRⁿ (an m × n matrix over R).
Then there exist P ∈ GL_m(R), Q ∈ GL_n(R) such that for some integer r ≤ min(m, n),


PAQ = diag(e₁, ..., e_r, 0, ..., 0), e_i || e_{i+1}, e_r ≠ 0. (7.2.1)




Proof. The aim will be to reduce A to the required form by a number of invertible 
operations on the rows and columns. In the first place there are the elementary 
operations; for the columns they are 


(i) interchange two columns,
(ii) multiply a column on the right by a unit factor,
(iii) add a right multiple of one column to another.


For a Euclidean domain these operations are enough, but in the general case another
operation is needed; this is best illustrated by reducing a 1 × 2 matrix. We have a
matrix (a b) and need an invertible 2 × 2 matrix Q such that


(a b)Q = (k 0). (7.2.2)


Clearly k, the highest common left factor (HCLF) of a and b, will be a generator for
the right ideal generated by a and b. We may exclude the case k = 0, for then
a = b = 0 and there is nothing to prove. Thus we have aR + bR = kR, say a = ka₁,
b = kb₁, and there exist c₁′, d₁′ such that ka₁d₁′ − kb₁c₁′ = k, hence a₁d₁′ − b₁c₁′ = 1.
By hypothesis Rc₁′ ∩ Rd₁′ is principal, with generator d₁c₁′ = c₁d₁′, and c₁, d₁ have no
common left factor, so there exist a₁′, b₁′ ∈ R such that d₁a₁′ − c₁b₁′ = 1. Thus we have


( a₁  b₁ ) (  d₁′  −b₁′ )   ( 1  * )
( c₁  d₁ ) ( −c₁′   a₁′ ) = ( 0  1 ),


where * stands for the entry b₁a₁′ − a₁b₁′, whose value is immaterial.
This shows that the first matrix on the left, C say, is invertible. It follows that
(k 0)C = (a b) and (a b)C⁻¹ = (k 0), so C⁻¹ is the required matrix. Thus we
have a fourth operation


(iv) multiply two columns on the right by an invertible 2 × 2 matrix.


We can now proceed with the reduction. If A = 0, there is nothing to prove; otherwise
we bring a non-zero element to the (1, 1)-position in A, by permuting rows and
permuting columns, using (i). Next we use (iv) to replace a₁₁ successively by the
HCLF of a₁₁ and a₁₂, then by the HCLF of the new a₁₁ and a₁₃, and so on. After
n − 1 steps we have transformed A to a form where a₁₂ = a₁₃ = ... = a_{1n} = 0. By
symmetry the same process can be applied to the first column of A; in the course
of the reduction the first row of A may again become non-zero, but this can
happen only if the length (i.e. the number of factors) of a₁₁ is reduced; therefore
by induction on the length of a₁₁ we reach the form a₁₁ ⊕ A₁, where A₁ is
(m − 1) × (n − 1). We now apply the same process to A₁ and by induction on
max(m, n) reach the form


diag(a₁, a₂, ..., a_r, 0, ..., 0).


Consider a₁ and a₂; for any d ∈ R we have


( 1  d ) ( a₁  0  )   ( a₁  da₂ )
( 0  1 ) ( 0   a₂ ) = ( 0   a₂  );


now we can further diminish the length of a₁ unless a₁ is a left factor of da₂ for
all d ∈ R, i.e. unless a₁R ⊇ Ra₂. But in that case a₁R ⊇ Ra₂R ⊇ Ra₂; thus a₁|c|a₂,
where c is the invariant generator of the ideal Ra₂R. Hence a₁||a₂, and by repeating
the argument we obtain the expression on the right of (7.2.1). The totality of column
operations amounts to right multiplication by an invertible matrix Q, while the row
operations yield P and we thus have the equation (7.2.1). ∎


We remark that most of the PIDs we encounter will be integral domains with a 
norm function satisfying the Euclidean algorithm, i.e. Euclidean domains (possibly 
non-commutative). In that case we can instead of (iv) use the Euclidean algorithm, 
with an induction on the norm instead of the length, to accomplish the reduction in 
Theorem 7.2.1 (cf. the proof of Theorem 3.5.1). Further we can dispense with (ii), so
P, Q can in this case be taken to be products of elementary matrices. 
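For the Euclidean domain Z the reduction can be carried out mechanically. The following Python sketch is one possible arrangement of the steps, not the only one: it repeatedly moves an entry of least absolute value to the pivot position, clears its row and column by division with remainder, and enforces divisibility of the remaining block. The function name `smith_normal_form` and the sample matrices are our own; the final sign change is a purely cosmetic use of a unit factor.

```python
def smith_normal_form(A):
    """Diagonalize an integer matrix by elementary row and column operations."""
    A = [list(row) for row in A]
    m, n = len(A), len(A[0])
    t = 0
    while t < min(m, n):
        # Pick a non-zero entry of least absolute value in the block A[t:][t:].
        entries = [(abs(A[i][j]), i, j)
                   for i in range(t, m) for j in range(t, n) if A[i][j] != 0]
        if not entries:
            break                       # the remaining block is zero
        _, i, j = min(entries)
        A[t], A[i] = A[i], A[t]         # interchange two rows
        for row in A:                   # interchange two columns
            row[t], row[j] = row[j], row[t]
        p = A[t][t]
        # Subtract multiples of row t and of column t (division with remainder).
        for i in range(t + 1, m):
            q = A[i][t] // p
            A[i] = [x - q * y for x, y in zip(A[i], A[t])]
        for j in range(t + 1, n):
            q = A[t][j] // p
            for row in A:
                row[j] -= q * row[t]
        if any(A[i][t] for i in range(t + 1, m)) or \
           any(A[t][j] for j in range(t + 1, n)):
            continue                    # a smaller remainder appeared: new pivot
        # The pivot must divide every entry of the remaining block.
        bad = next((i for i in range(t + 1, m)
                    for j in range(t + 1, n) if A[i][j] % p), None)
        if bad is not None:
            A[t] = [x + y for x, y in zip(A[t], A[bad])]
            continue
        A[t][t] = abs(p)                # cosmetic: make the diagonal entry positive
        t += 1
    return A

D = smith_normal_form([[2, 4, 4], [-6, 6, 12], [10, 4, 16]])
```

Each pass either advances to the next diagonal position or strictly decreases the least absolute value in the current block, which plays the role of the induction on the norm mentioned above.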

In the case of simple rings Theorem 7.2.1 still simplifies, since then every invariant 
element is a unit. 


Corollary 7.2.2. If R is a simple principal ideal domain, then every matrix over R is
associated to a matrix of the form diag(1, 1, ..., 1, a, 0, ..., 0) (a ∈ R).


Proof. If a||b, then either b = 0 or a is a unit. Now any unit can be transformed to 1
by applying (ii), so there can only be one diagonal element not 1 or 0. ∎


We note the consequences for modules over a PID: 


Proposition 7.2.3. Let R be a principal ideal domain. Then every submodule of a free
module of finite rank n is free of rank at most n. Further, any finitely generated
R-module is a direct sum of cyclic modules


M = M₁ ⊕ ... ⊕ M_r, where M_{i+1} is a quotient of M_i.


If moreover, R is simple, then every finitely generated R-module is the direct sum of a
cyclic module and a free module of finite rank.


Proof. Let F be a free right R-module of finite rank n and M a submodule of F; we shall
use induction on n. If α₁ denotes the projection on the first component R in F, then α₁M
is a right ideal, principal by hypothesis, and we have the exact sequence


0 → M′ → M → α₁M → 0.


Here M′ is a submodule of ker α₁, while α₁M is free (of rank 1 or 0), hence the
sequence splits and M ≅ M′ ⊕ α₁M, and since M′ is free of rank ≤ n − 1 by the
induction hypothesis, the first assertion follows. Now the rest is clear from Theorem
7.2.1 and Corollary 7.2.2. ∎


In a skew field every non-zero element is a unit, so then Corollary 7.2.2 yields the
well-known result that every matrix over a skew field is associated to I_r ⊕ 0, where r
is the rank of the matrix. We can also use Theorem 7.2.1 to describe the rank of a
matrix over the rational function field K(t), defined as the field of fractions of
K[t], where K is any skew field and t an indeterminate. In K[t] we have the Euclidean
algorithm relative to the degree function, hence K[t] is a PID. We further remark
that if C is the centre of K, then for any polynomial f of degree n over K and any
λ ∈ C such that f(λ) = 0, we can write f = (t − λ)g, where g has degree n − 1. By
induction it follows that f cannot have more than n zeros in C.


Lemma 7.2.4. Let K be a skew field with infinite centre C, and consider the polynomial
ring K[t], with field of fractions K(t). If A = A(t) is a matrix over K[t], then the rank
of A over K(t) is the supremum of the ranks of A(α), α ∈ C. In fact, this supremum is
assumed for all but a finite number of values of α.


Proof. Since K[t] is a PID we can by Theorem 7.2.1 find invertible matrices P, Q
such that


PAQ = diag(f₁, ..., f_r, 0, ..., 0), where f_i ∈ K[t]. (7.2.3)


The product of the non-zero diagonal terms on the right gives us a polynomial f
whose zeros in C are the only points of C at which A = A(t) falls short of its maximum
rank, and the number of these values cannot exceed the degree of f. ∎
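Lemma 7.2.4 is easy to observe numerically in the commutative case K = C = Q. The following Python sketch is only an illustration under that assumption: the matrix A(t) and the helper `rank` (Gaussian elimination over the rationals) are our own choices. The matrix has rank 2 over Q(t), because its first row is t times the second, and the rank drops exactly at the zeros of t(t − 2).

```python
from fractions import Fraction

def rank(M):
    """Rank of a rational matrix by Gaussian elimination."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for col in range(len(M[0])):
        # Look for a pivot in this column among the rows not yet used.
        piv = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][col] / M[r][col]
            M[i] = [x - factor * y for x, y in zip(M[i], M[r])]
        r += 1
    return r

def A(t):
    """A sample matrix over Q[t], evaluated at t."""
    return [[t, t * t, 0],
            [1, t, 0],
            [0, 0, t * (t - 2)]]

ranks = {alpha: rank(A(alpha)) for alpha in range(-3, 4)}
exceptional = [alpha for alpha, r in ranks.items() if r < max(ranks.values())]
```

Among the sampled values the maximum rank 2 is attained everywhere except at α = 0 and α = 2.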


Exercises 


1. Show that over a PID any submodule of a free module (even of infinite rank) is 
free. 

2. Give the details of the proof that for a Euclidean domain operations (i), (iii) 
suffice to accomplish the reduction to the form (7.2.1). 

3. Let K be a skew field with centre C. Show that a polynomial over C may well have 
infinitely many zeros in K. How is this to be reconciled with the remark before 
Lemma 7.2.4? 

4. Let R be a PID and M a right R-module. An element x of M is called a torsion
element if xc = 0 for some c ∈ R^×. Verify that the set tM of all torsion elements
of M is a submodule of M (called the torsion submodule).

5. If tM = 0, M is called torsion free. Show that any finitely generated torsion free 
module over a PID R is free and deduce that any finitely generated R-module 
splits over its torsion submodule. 

6. Let R be any ring and M a right R-module. Show that for any invariant element
c of R, Mc is a submodule. If R is a PID and M is presented by a matrix
diag(a₁, ..., a_r, 0, ..., 0), a_i||a_{i+1}, then for any invariant element c such that
a_k|c|a_{k+1} we have M/Mc ≅ R/a₁R ⊕ ... ⊕ R/a_kR ⊕ R/cR ⊕ ... ⊕ R/cR. Deduce
that the a_i are unique up to similarity, where a, b are similar if R/aR ≅ R/bR.


7.3 Skew polynomial rings and Laurent series 


In commutative ring theory the polynomial ring R[x] in an indeterminate x plays a
basic role. The corresponding concept for general rings is the ring freely generated by
an indeterminate x over R; in the notation of Section 2.7 this is the tensor ring R_A⟨x⟩,
where A is the subring of R generated by 1. This looks very different from R[x]; its
elements are not at all like the usual polynomials, but we can simplify matters by
taking the special case of those rings whose elements can be written in the form
of polynomials. Thus for a given ring R we consider a ring P whose elements can
be written uniquely in the form


f = a₀ + xa₁ + ... + x^n a_n, where a_i ∈ R. (7.3.1)


As usual we define the degree of f as the highest power of x which occurs with a
non-zero coefficient:


d(f) = max{i | a_i ≠ 0}. (7.3.2)


We shall characterize the ring P under the assumption that the degree has the usual
properties:


D.1 d(f) ≥ 0 for f ≠ 0, d(0) = −∞,
D.2 d(f − g) ≤ max{d(f), d(g)},
D.3 d(fg) = d(f) + d(g).


An integer-valued function d on a ring satisfying D.1-D.3 is called a degree function
(essentially this means that −d is a valuation, see Section 9.4 or BA, Chapter 9).
Leaving aside the trivial case R = 0, we see from D.3 that P is an integral domain
and moreover, for any a ∈ R^×, ax has the degree 1, so there exist a^α, a^δ ∈ R such
that


ax = xa^α + a^δ. (7.3.3)


By the uniqueness of the form (7.3.1), the elements a^α, a^δ are uniquely determined
by a, and a^α = 0 iff a = 0. By (7.3.3) we have (a + b)x = x(a + b)^α + (a + b)^δ,
ax + bx = xa^α + a^δ + xb^α + b^δ, hence on comparing the right-hand sides we find


(a + b)^α = a^α + b^α, (a + b)^δ = a^δ + b^δ, (7.3.4)


so α and δ are additive mappings of R into itself. Next we have (ab)x =
x(ab)^α + (ab)^δ, a(bx) = a(xb^α + b^δ) = (xa^α + a^δ)b^α + ab^δ, hence


(ab)^α = a^αb^α, (ab)^δ = a^δb^α + ab^δ. (7.3.5)
Finally, 1·x = x·1 = x·1^α + 1^δ, so
1^α = 1, 1^δ = 0. (7.3.6)


The first equations in (7.3.4)-(7.3.6) show that α is a ring homomorphism, and by
the remark following (7.3.3) it is injective. The remaining equations show δ to be a
(1, α)-derivation; here we shall refer to δ more briefly as an α-derivation.

We next observe that the commutation rule (7.3.3), with the uniqueness of (7.3.1),
is enough to determine the multiplication in P. For by the distributive law we need
only know x^m a·x^n b, and by (7.3.3) the effect of moving a past a factor x is


x^m a x^n b = x^{m+1} a^α x^{n−1} b + x^m a^δ x^{n−1} b.


Now an induction on n allows us to write x^m a x^n b as a polynomial in x. Thus P is
completely determined when α, δ are given. We shall call P a skew polynomial ring
in x over R relative to the endomorphism α and α-derivation δ, and write it as
P = R[x; α, δ]. Thus we have proved the first part of


Theorem 7.3.1. Let P be a ring whose elements can be expressed uniquely as polynomials
in x with coefficients in a non-trivial ring R, as in (7.3.1), with a degree function
defined by (7.3.2), and satisfying the commutation rule (7.3.3). Then R is an
integral domain, α is an injective endomorphism, δ is an α-derivation and
P = R[x; α, δ] is the skew polynomial ring in x over R, relative to α, δ. Conversely,
given an integral domain R with an injective endomorphism α and an α-derivation
δ, there exists a skew polynomial ring R[x; α, δ].


Proof. It only remains to prove the converse. Consider the set R^N of all sequences
(a_i) = (a₀, a₁, ...) (a_i ∈ R), as right R-module. Besides the right multiplication by
R we have the additive group endomorphism


x : (a_i) ↦ (a_i^δ + a_{i−1}^α), where a_{−1} = 0. (7.3.7)


Clearly R acts faithfully on R^N by right multiplication, so we may identify R with its
image in E = End(R^N). Let P be the subring of E generated by R and x; we claim that
P is the required skew polynomial ring. To verify (7.3.3) we have


(c_i)ax = ((c_ia)^δ + (c_{i−1}a)^α) = (c_i^δa^α + c_ia^δ + c_{i−1}^αa^α),
(c_i)(xa^α + a^δ) = (c_i^δa^α + c_{i−1}^αa^α + c_ia^δ).


Hence ax = xa^α + a^δ in P, and (7.3.3) holds. It follows that every element of P can
be written as a polynomial (7.3.1), and this expression is unique, for we have


(1, 0, 0, ...)(a₀ + xa₁ + ... + x^n a_n) = (a₀, a₁, ..., a_n, 0, ...),


so distinct polynomials represent different elements of P. Finally it is clear that d(f)
defined as in (7.3.2) is a degree function, because R is an integral domain and α is
injective. So P is indeed a skew polynomial ring. ∎
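The commutation rule (7.3.3) translates directly into a multiplication procedure. The following Python sketch stores f = a₀ + xa₁ + ... as the list [a₀, a₁, ...] and takes α and δ as functions; the names `a_times_x_power` and `skew_mul` are ours. The test data are assumptions chosen for illustration only: complex numbers with conjugation as α, and either δ = 0 or δ(a) = a − ā, which one checks to satisfy (7.3.4)-(7.3.6).

```python
def a_times_x_power(a, n, alpha, delta):
    """Write a * x^n as sum x^i c_i, using a x = x a^alpha + a^delta."""
    if n == 0:
        return [a]
    hi = a_times_x_power(alpha(a), n - 1, alpha, delta)  # x * (a^alpha x^(n-1))
    lo = a_times_x_power(delta(a), n - 1, alpha, delta)  # a^delta x^(n-1)
    out = [0] + hi
    for i, c in enumerate(lo):
        out[i] += c
    return out

def skew_mul(f, g, alpha, delta):
    """Product of sum x^m a_m and sum x^n b_n, coefficients kept on the right."""
    out = [0] * (len(f) + len(g) - 1)
    for m, a in enumerate(f):
        for n, b in enumerate(g):
            # (x^m a)(x^n b) = x^m (a x^n) b
            for i, c in enumerate(a_times_x_power(a, n, alpha, delta)):
                out[m + i] += c * b
    return out

conj = lambda a: a.conjugate()
zero = lambda a: 0
inner = lambda a: a - a.conjugate()   # an alpha-derivation for alpha = conj

x, i = [0, 1], [1j]
ix = skew_mul(i, x, conj, zero)       # i x = x (-i)
xi = skew_mul(x, i, conj, zero)       # x i

f, g, h = [1 + 2j, 3j], [2, 1 - 1j, 1j], [1j, 1]
left = skew_mul(skew_mul(f, g, conj, inner), h, conj, inner)
right = skew_mul(f, skew_mul(g, h, conj, inner), conj, inner)
```

The recursion in `a_times_x_power` takes about 2^n steps, which is adequate for small examples; the comparison of `left` and `right` is a spot check of associativity, not a proof.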


Restricting attention to polynomial rings is analogous to singling out Ore domains 
among general integral domains, and in fact the skew polynomial rings were first 
considered by Ore in 1933; the general form of Theorem 7.3.1 was obtained by 
Jacobson in 1934. 

It is important to observe that the construction of the skew polynomial ring is not
left-right symmetric, and besides R[x; α, δ] we can also introduce the left skew polynomial
ring, in which the coefficients are written on the left. The commutation rule
(7.3.3) then has to be replaced by


xa = a^α x + a^δ. (7.3.8)




In general the left and right skew polynomial rings are distinct, but when α is an
automorphism of R, with inverse β, say, then on replacing a by a^β we can write
(7.3.3) as a^β x = xa + a^{βδ}, i.e.


xa = a^β x − a^{βδ},


which is of the form (7.3.8). Hence we obtain


Proposition 7.3.2. The ring R[x; α, δ] is a left skew polynomial ring whenever α is an
automorphism of R. ∎


In the construction of skew polynomial rings it was necessary to start from an
integral domain because we insisted on a degree function; this is not essential, but
it is the case mostly used in applications. Frequently the coefficient ring will be a
field, possibly skew. In that case the skew polynomial ring is a principal right ideal
domain; this follows as in the commutative case, using the division algorithm.
Thus we are given f, g ∈ P = K[x; α, δ] of degrees m, n, where n ≥ m say. We
have f = x^m a₀ + ..., g = x^n b₀ + ..., where a₀, b₀ ≠ 0. Hence g − fa₀⁻¹x^{n−m}b₀ is
of degree < n. Given any right ideal 𝔞 in P, let f ∈ 𝔞 be non-zero of least degree;
then for any g ∈ 𝔞 we have d(g) ≥ d(f), hence d(g − fh) < d(g) for some h ∈ P;
we can repeat this process until we reach a term g − fh₁ of degree < d(f). But
then g − fh₁ = 0, because g − fh₁ ∈ 𝔞 and f ∈ 𝔞 was of least degree. It follows
that 𝔞 = fP, so every right ideal is principal. This proves


Theorem 7.3.3. Let K be a skew field, α an endomorphism and δ an α-derivation of K.
Then the skew polynomial ring K[x; α, δ] is a Euclidean domain and hence a principal
right ideal domain.


In particular, for a skew field K the skew polynomial ring K[x; α, δ] is right
Noetherian and hence also right Ore, by Proposition 7.1.9, so we can form its
field of fractions. This is denoted by K(x; α, δ); its elements are fg⁻¹, where f, g
are polynomials (7.3.1) with coefficients in K.


Proposition 7.3.4. Any skew polynomial ring over a right Ore domain is again right 
Ore. 


Proof. Let R be a right Ore domain and K its field of fractions. Since α is an injective
endomorphism of R, it can by Corollary 7.1.2 be extended to an endomorphism of K,
again denoted by α. Further, δ gives rise to an R^×-inverting homomorphism from R to
K₂, and this gives an α-derivation of K, again written δ. Now we have the inclusions


R[x; α, δ] ⊆ K[x; α, δ] ⊆ K(x; α, δ).


Any element u ∈ K(x; α, δ) has the form fg⁻¹, where f, g ∈ K[x; α, δ]. By Proposition
7.1.7 we can bring the finite set of coefficients of f, g to a common denominator, say
f = f₁c⁻¹, g = g₁c⁻¹, where f₁, g₁ ∈ R[x; α, δ], c ∈ R^×. Now u = f₁c⁻¹(g₁c⁻¹)⁻¹ = f₁g₁⁻¹,
so every element of K(x; α, δ) can be written as a right fraction of elements of R[x; α, δ],
and hence the latter is right Ore, by Corollary 7.1.6. ∎




The Hilbert basis theorem extends to skew polynomial rings relative to an auto- 
morphism (for endomorphisms it need not hold, see Exercise 2). 


Theorem 7.3.5. Let R be a right Noetherian domain, α an automorphism and δ an
α-derivation of R. Then the skew polynomial ring R[x; α, δ] is again a right Noetherian
domain.


Proof. This is essentially the same as in the commutative case (BA, Theorem 10.4.1)
and will be left to the reader. Here it will be found more convenient to write the
coefficients on the left of x; since α is an automorphism, this is possible. ∎


We list some examples of skew polynomial rings. When the derivation is 0, we
write R[x; α] in place of R[x; α, 0].


1. α = 1, δ = 0. This is the ordinary polynomial ring R[x] in a central indeterminate
(although R need not be commutative).

2. The complex-skew polynomial ring C[x; ¯] is the ring of polynomials with
complex coefficients and commutation rule


ax = xā, where ā is the complex conjugate of a.


The centre of this ring is the ring R[x²] of all real polynomials in x², and
C[x; ¯]/(x² + 1) is the division algebra of real quaternions. More generally, let k
be a field of characteristic not 2 with a quadratic extension K = k(√b); this has an
automorphism α given by (r + s√b)^α = r − s√b. For any a ∈ k^×, K[x; α]/(x² − a)
is the quaternion algebra (a, b; k) (see Section 5.4).

3. Let K be any commutative ring and denote by A₁[K] the K-algebra generated by
u, v over K with the relation


uv − vu = 1. (7.3.9)


This ring is called the Weyl algebra on u, v over K. It may also be defined as the skew
polynomial ring R[v; 1, ′], where R = K[u] and ′ denotes differentiation with respect
to u. We observe that when K is a Noetherian domain, then so is A₁[K].

From (7.3.9) we obtain by induction on n,


u^n v = vu^n + nu^{n−1},


hence u^n v − vu^n = nu^{n−1} = ∂(u^n)/∂u. A similar formula holds for
commutation by u and by linearity it follows that for any f ∈ A₁[K],


fv − vf = ∂f/∂u, uf − fu = ∂f/∂v. (7.3.10)


From these formulae it is easy to show that for a field k of characteristic 0, A₁[k] is
a simple ring. For if 𝔞 is a non-zero ideal in A₁[k], pick an element f(u, v) ≠ 0 in 𝔞
of least possible degree in u. Then ∂f/∂u = fv − vf ∈ 𝔞, but this has lower degree and
so must be 0. Hence f = f(v) is a polynomial in v alone. If its v-degree is taken
minimal, then ∂f/∂v = uf − fu = 0 and so f = c ∈ k. Thus 𝔞 contains a non-zero
element of k and so must be the whole ring, i.e. A₁[k] is simple, as claimed.




We observe that A₁[k] is an example of a simple Noetherian domain, not a field.
For a field k of finite characteristic p, A₁[k] is no longer simple, since it has the centre
k[u^p, v^p].
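The commutation formulae (7.3.10) can be checked by machine on sample elements. In the following Python sketch an element of A₁[Z] is stored as a dictionary {(i, j): c} standing for the sum of the terms c·u^i v^j; this normal form, the function names and the sample element f are our own choices. The product uses the identity v^b u^c = Σ_k (−1)^k k! C(b, k) C(c, k) u^{c−k} v^{b−k}, which follows from (7.3.9) by induction.

```python
from math import comb, factorial

def mul(f, g):
    """Product in the Weyl algebra, with every term rewritten as u^i v^j."""
    out = {}
    for (a, b), x in f.items():
        for (c, d), y in g.items():
            # Move v^b past u^c, picking up lower-order correction terms.
            for k in range(min(b, c) + 1):
                coeff = x * y * (-1) ** k * factorial(k) * comb(b, k) * comb(c, k)
                key = (a + c - k, b + d - k)
                out[key] = out.get(key, 0) + coeff
    return {key: val for key, val in out.items() if val != 0}

def sub(f, g):
    """Difference f - g, dropping zero terms."""
    out = dict(f)
    for key, val in g.items():
        out[key] = out.get(key, 0) - val
    return {key: val for key, val in out.items() if val != 0}

def d_du(f):
    """Formal partial derivative with respect to u."""
    return {(i - 1, j): i * c for (i, j), c in f.items() if i > 0}

def d_dv(f):
    """Formal partial derivative with respect to v."""
    return {(i, j - 1): j * c for (i, j), c in f.items() if j > 0}

u, v, one = {(1, 0): 1}, {(0, 1): 1}, {(0, 0): 1}
f = {(2, 1): 3, (0, 3): -1, (1, 0): 5}      # 3u^2 v - v^3 + 5u
```

For instance `sub(mul(f, v), mul(v, f))` agrees with `d_du(f)`, and `sub(mul(u, f), mul(f, u))` with `d_dv(f)`, as (7.3.10) predicts.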

4. The translation ring k⟨x, y | xy = y(x + 1)⟩ may be described as R = A[y; σ],
where A is the polynomial ring k[x] with the shift automorphism σ : x ↦ x + 1.

5. Let k be a field of prime characteristic p and F : a ↦ a^p the Frobenius endomorphism.
Then k[x; F] is a skew polynomial ring whose field of fractions k(x; F)
has an inner automorphism inducing F, namely conjugation by x.

More generally, if k is any field, even skew, with an endomorphism α, then k(x; α)
is an extension with an inner automorphism inducing α on k, because (7.3.3) now
reads ax = xa^α. Similarly, if δ is an α-derivation, then k[x; α, δ] is a ring with an
inner α-derivation inducing δ, as we see by writing (7.3.3) in the form
a^δ = ax − xa^α.

6. Let K be a commutative field with an automorphism α of finite order n, and
consider the field of fractions E = K(x; α). If k is the fixed field of α, then
F = k(x^n) is contained in the centre of E, as is easily checked. Moreover, K(x^n) is
a commutative subfield, a Galois extension of F of degree n, and provided that K
contains a primitive n-th root of 1, the structure of E is given by the equations


ax^i = x^i a^{α^i} for all a ∈ K, i = 0, 1, ..., n − 1.


It follows that k(x^n) is the precise centre of E and E is of dimension n² over its centre,
in fact a crossed product (see Section 5.5).

7. If D is a skew field not algebraic over its centre k, then D ⊗ k(t) is a simple
Noetherian domain (by Theorem 5.1.2), but not Artinian, hence not a skew field.

8. Let R be an integral domain with an automorphism α. In the skew polynomial
ring R[x; α] the powers of x form an Ore set, and the ring of fractions consists of all
polynomials Σ x^i a_i involving negative as well as positive powers of x. Such an
expression is called a skew Laurent polynomial and the resulting ring may be written
R[x, x⁻¹; α].

For each polynomial f ∈ P = R[x; α, δ] of the form (7.3.1) we can define its order
o(f) as the lowest power of x occurring with a non-zero coefficient:


o(f) = min{i | a_i ≠ 0}.
This function has the properties of a valuation on P:


O.1 o(f) ≥ 0 for all f ∈ P, o(0) = ∞,
O.2 o(f − g) ≥ min{o(f), o(g)},
O.3 o(fg) = o(f) + o(g).


Taking first the case δ = 0, we can form the ring R[[x; α]] of formal power series
over R as the set of all infinite series


f = a₀ + xa₁ + x²a₂ + ... (7.3.11)


with componentwise addition and with multiplication based on the commutation
rule ax = xa^α. There is of course no question of convergence here; we regard
(7.3.11) as a series in the purely formal sense. We can describe f equally well as an
infinite sequence (a_i) = (a₀, a₁, ...), with addition (a_i) + (b_i) = (a_i + b_i) and with
multiplication


(a_i)(b_j) = (c_k), where c_k = Σ_{i+j=k} a_i^{α^j} b_j. (7.3.12)


Alternatively we can regard R[[x; α]] as the completion of the skew polynomial ring
R[x; α] with respect to the powers of the ideal generated by x; these powers define a
topology called the x-adic topology. This topological viewpoint is not essential for
the construction, but it helps in understanding the situation.

Let R be a ring and α an automorphism of R. The powers of x form an Ore set in
R[[x; α]] and by taking fractions we obtain the ring R((x; α)) of all formal Laurent
series


f = x^{−r}a_{−r} + x^{−r+1}a_{−r+1} + ... + a₀ + xa₁ + x²a₂ + .... (7.3.13)


This is again a ring, with the same multiplication (7.3.12); here the restriction to
finitely many negative powers is necessary to ensure that the multiplication rule
(7.3.12) has a sense. This is also the reason for taking α to be an automorphism,
since now j may take negative values in (7.3.12).

Let us now consider a skew polynomial ring $R[x; \alpha, \delta]$, where $\delta$ may be non-zero,
but $\alpha$ is still an automorphism, and ask whether a power series ring can be formed.
If we attempt to define the multiplication of power series by means of the commutation
formula (7.3.3), we shall find that (apart from a more complicated form
for the coefficients of the product), the product cf, where $c \in R$, cannot always be
expressed as a power series, because there will in general be contributions to the
coefficient of a given power $x^r$ from each term $c x^n a_n$ ($n \geq r$) and so we may have
infinitely many such terms to consider. In terms of the x-adic topology we can
express this by saying that left multiplication by $c \in R$ is not continuous; this follows
from (7.3.3), because when $a^{\delta} \neq 0$, we have $o(ax) < o(x)$.

One way to overcome this difficulty is to introduce $y = x^{-1}$ and rewrite (7.3.3) in
terms of y. We find

\[ ya = a^{\alpha} y + y a^{\delta} y = a^{\alpha} y + a^{\delta\alpha} y^2 + y a^{\delta^2} y^2 = \cdots, \]

hence by induction we obtain

\[ ya = a^{\alpha} y + a^{\delta\alpha} y^2 + a^{\delta^2\alpha} y^3 + \cdots. \tag{7.3.14} \]

With the help of this commutation formula we can multiply power series in y and
even Laurent series. We observe that in passing from x to $y = x^{-1}$ we have also
had to change the side on which the coefficients are put; of course this is immaterial
as long as $\alpha$ is an automorphism. To be precise, from any skew polynomial ring
$R[x; \alpha, \delta]$ we can form a skew power series ring in $x^{-1}$, with coefficients on the
left; in order to define Laurent series in $x^{-1}$ we need to assume that $\alpha$ is an
automorphism. We shall not pursue this point but note one result which illustrates
the usefulness of power series.
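As a small check of the commutation formula for $y = x^{-1}$, the following worked case may help. It is only a sketch under two assumptions that are not taken from this section: that (7.3.3) has the form $ax = xa^{\alpha} + a^{\delta}$, and that we work in the hypothetical example $R = k[t]$ with $\alpha = 1$ and $\delta = d/dt$, so that $t^{\delta} = 1$ and $t^{\delta^2} = 0$.

```latex
% Assumption: (7.3.3) reads  a x = x a^{\alpha} + a^{\delta};  here \alpha = 1 and \delta = d/dt on k[t].
\begin{align*}
  tx &= xt + 1, \qquad\text{i.e.}\qquad xt = tx - 1, \\
  yt &= ty + y^2 \qquad\text{(the series stops after two terms, since } t^{\delta^2} = 0\text{)}, \\
  x\,(ty + y^2) &= (xt)\,y + (xy)\,y = (tx - 1)\,y + y = t = x\,(yt).
\end{align*}
```

Since x is invertible, the last line confirms $yt = ty + y^2$ in this example.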




Theorem 7.3.6. Let K be a skew field and $\alpha$ an automorphism of K. Further denote the
centre of K by C and the subfield of C fixed under $\alpha$ by $C_0$. If no positive power of $\alpha$ is an
inner automorphism of K, then the centre of $K(x; \alpha)$ is $C_0$.

Proof. Every element of $K(x; \alpha)$ can be written as a Laurent series $f = \sum x^i a_i$. If this
lies in the centre, then $fc = cf$ for all $c \in K$, i.e. $\sum x^i (a_i c - c^{\alpha^i} a_i) = 0$, hence

\[ a_i c = c^{\alpha^i} a_i \quad \text{for all } i \in \mathbf{Z} \text{ and all } c \in K. \tag{7.3.15} \]

If $a_i \neq 0$ for some $i > 0$, then $\alpha^i$ is inner, by (7.3.15); in case $i < 0$, $\alpha^{-i}$ is inner, but
this contradicts the hypothesis. Hence $a_i = 0$ for $i \neq 0$ and $f = a_0 \in K$. Now (7.3.15)
reads $a_0 c = c a_0$, so $a_0 \in C$, and since $x a_0 = a_0 x = x a_0^{\alpha}$, we have $f = a_0 \in C_0$.
Conversely, every element of $C_0$ commutes with every element of $K(x; \alpha)$. ∎


Finally we note a rationality criterion for power series, which applies also in the 
skew case. 


Proposition 7.3.7. Let K be a skew field with an automorphism $\alpha$, and consider the
natural embedding of $K(x; \alpha)$ in the skew field $K((x; \alpha))$ of formal Laurent series.
A given series $f = \sum x^i a_i$ lies in $K(x; \alpha)$ if and only if there exist integers $r$, $n_0$ and
elements $c_1, \ldots, c_r \in K$ such that

\[ a_n = a_{n-1}^{\alpha} c_1 + a_{n-2}^{\alpha^2} c_2 + \cdots + a_{n-r}^{\alpha^r} c_r \quad \text{for all } n > n_0. \tag{7.3.16} \]

Proof. The series f lies in $K(x; \alpha)$ iff there is a polynomial g with constant term 1 such
that fg is a Laurent polynomial. Writing $g = 1 - x c_1 - \cdots - x^r c_r$, we require that in
the product $(\sum x^i a_i)(1 - \sum x^j c_j)$ all coefficients of powers beyond a given one, say
$x^{n_0}$, vanish. On equating the coefficient of $x^n$ to 0 we just obtain (7.3.16), and the
conclusion follows.
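The criterion can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the author's construction: it takes complex coefficients with $\alpha$ = complex conjugation, builds a series from a recurrence of the shape $a_n = \sum_j a_{n-j}^{\alpha^j} c_j$, and checks that multiplying by $g = 1 - xc_1 - \cdots - x^r c_r$ kills all higher coefficients. The helper names are hypothetical.

```python
def alpha_pow(a, alpha, j):
    """Apply the automorphism alpha j times to a."""
    for _ in range(j):
        a = alpha(a)
    return a


def extend_by_recurrence(initial, c, alpha, n_terms):
    """Extend a_0, ..., a_{m-1} by a_n = sum_{j=1..r} alpha^j(a_{n-j}) * c_j."""
    a = list(initial)
    r = len(c)
    assert len(a) >= r, "need at least r starting coefficients"
    while len(a) < n_terms:
        n = len(a)
        a.append(sum(alpha_pow(a[n - j], alpha, j) * c[j - 1] for j in range(1, r + 1)))
    return a


def skew_mul(f, g, alpha, n_terms):
    """Coefficients c_n = sum_{i+j=n} alpha^j(f_i) * g_j, truncated to n_terms."""
    return [
        sum(alpha_pow(f[i], alpha, n - i) * g[n - i]
            for i in range(n + 1) if i < len(f) and n - i < len(g))
        for n in range(n_terms)
    ]


conj = lambda z: z.conjugate()         # assumed automorphism
initial, c = [1, 1j], [1 + 1j, 2]      # arbitrary starting data and recurrence coefficients
f = extend_by_recurrence(initial, c, conj, 10)
g = [1] + [-cj for cj in c]            # g = 1 - x c_1 - x^2 c_2
product = skew_mul(f, g, conj, 10)     # coefficients of f*g up to x^9
print(product[len(initial):])          # every entry is zero
```

With $\alpha$ the identity and $c_1 = c_2 = 1$ the same code produces the Fibonacci numbers, the coefficients of the rational function $1/(1 - x - x^2)$.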


Exercises 


1. Supply the details of the proof of Theorem 7.3.5, and point out where the fact
that $\alpha$ is an automorphism is used.

2. (Jategaonkar, Koshevoi) Let K be a skew field and $\alpha$ an endomorphism of K.
Show that $K[x; \alpha]$ is a left Ore domain iff $\alpha$ is surjective (and hence an
automorphism). Using Proposition 7.1.8, obtain an embedding of the free algebra
$K\langle x, y \rangle$ in a right Ore domain, and hence an embedding of $K\langle x, y \rangle$ in a skew
field.

3. Find a localization of the Weyl algebra over a field of finite characteristic, which
is a crossed product.

4. Show that for any ring K, a Weyl K-ring $A_1[K]$ on u, v may be defined by (7.3.9),
where u, v commute with the elements of K. Show that if K is a simple ring of
characteristic 0, then so is $A_1[K]$.

5. Show that if R is a Noetherian ring with an automorphism $\alpha$, then $R[x, x^{-1}; \alpha]$
is again Noetherian.




6. Let K be a ring with an injective endomorphism $\alpha$. Put $K_0 = K$ and take
an ascending chain of K-rings $K_n$, all isomorphic to K, such that $K_{n-1}$ is identified
with the image of $K_n$ under $\alpha$. Show that their union is a ring $K^{[\alpha]}$ to which
$\alpha$ can be extended as an automorphism. Verify that in the Laurent polynomial
ring $K[x, x^{-1}; \alpha]$ the subring $\bigcup x^n K x^{-n}$ is isomorphic to $K^{[\alpha]}$.

7. Let $F \subseteq E$ be any fields and $\alpha$ an automorphism of E mapping F into itself. Show
that $E(x; \alpha) \cap F((x; \alpha)) = F(x; \alpha)$.

8. Let K be a skew field with centre C, $\alpha$ an automorphism of K and $C_0$ the subfield
of C left fixed by $\alpha$. Suppose further that $\alpha^r = 1$, but no power $\alpha^i$ ($0 < i < r$) is
inner. Show that the centre of $K((x; \alpha))$ is $C_0((x^r))$ and the centre of $K(x; \alpha)$ is
$C_0(x^r)$.

9. (P. Draxl) Let k be a field of characteristic not 2 and $K = k(u_1, u_2, u_3, u_4)$,
where the $u_i$ are independent (central) indeterminates. Put $K_1 = k((x_1))$,
$K_2 = K_1((x_2; \alpha))$, $K_3 = K_2((x_3))$, $K_4 = K_3((x_4; \beta))$, where $\alpha: x_1 \mapsto -x_1$,
$\beta: x_3 \mapsto -x_3$, and by identifying $u_i$ with $x_i^2$ show that the K-subalgebra D of $K_4$
generated by $x_1, \ldots, x_4$ is a central division algebra of degree 4 over K, of the form
$D = (u_1, u_2; K) \otimes (u_3, u_4; K)$. Using the representation as Laurent series,
show that any multiplicative commutator on D has the form $1 + f$, where f
involves only positive powers of the x's. Deduce that if k contains a primitive
4th root of 1, then $SK_1(D)$ is non-trivial. (Hint. Use Exercise 8 to show that
D has centre K and verify that the 4th root of 1 has reduced norm 1. For
more details of this solution of the Tannaka–Artin problem see Draxl (1983)).

10. Show that the trace group, defined in Section 4.5, vanishes on the Weyl algebra 
over a field of characteristic 0. 


7.4 Goldie’s theorem


The main structure theorem for Artinian rings states that a semisimple ring is a
direct product of simple rings and each of the latter is a full matrix ring over a
skew field (Wedderburn theorems, BA, Section 5.2). Such precise information is
not to be expected for Noetherian rings, but in this case there is a reduction theorem
due to Alfred Goldie, which implies that a Noetherian semiprime ring has a quotient
ring which is semisimple (hence Artinian), and here prime Noetherian rings
correspond to simple Artinian rings. Our aim in this section is a proof of these results,
but some preparation will be necessary. We recall that a ring R is called prime if $R \neq 0$
and $aRb = 0$ implies $a = 0$ or $b = 0$; a ring R in which $aRa = 0$ implies $a = 0$ is called
semiprime. The precise relation between these rings will be described in Chapter 8.

We shall also need a substitute for the dimension of a vector space which can be
applied to general modules; this is the notion of uniform rank. Let M be an
R-module (over any ring R); we recall that a submodule $M'$ of M is said to be large
in M and M is an essential extension of $M'$ if

\[ N \neq 0 \Rightarrow N \cap M' \neq 0 \quad \text{for any submodule } N \text{ of } M. \tag{7.4.1} \]




In particular, R itself may be considered as a left or right R-module; we then obtain 
large left or right ideals, also called left large or right large, respectively. 
We list some properties of large submodules used later: 


L.1 If $M'$ is large in M and $M''$ is large in $M'$ then $M''$ is large in M.
For if $0 \neq N \subseteq M$, then $N \cap M' \neq 0$, hence $N \cap M'' = (N \cap M') \cap M'' \neq 0$. ∎

L.2 Let M be a right R-module and $M'$ a large submodule of M. For any $m \in M$, the set
$\mathfrak{c} = \{x \in R \mid mx \in M'\}$ is a large right ideal in R, and if $m \neq 0$, then $0 \neq m\mathfrak{c} \subseteq M'$.
Clearly $\mathfrak{c}$ is a right ideal; for any non-zero right ideal $\mathfrak{a}$ in R, either $m\mathfrak{a} = 0$, in
which case $\mathfrak{a} \subseteq \mathfrak{c}$, or $m\mathfrak{a} \neq 0$; then $m\mathfrak{a} \cap M' \neq 0$ and for any $a \in \mathfrak{a}$ such that
$0 \neq ma \in M'$ we have $0 \neq aR \subseteq \mathfrak{c} \cap \mathfrak{a}$, which shows $\mathfrak{c}$ to be right large. If moreover
$m \neq 0$, then taking $\mathfrak{a} = R$ we find $a$ with $0 \neq ma \in M'$; then $a \in \mathfrak{c}$, so $0 \neq m\mathfrak{c} \subseteq M'$. ∎

L.3 For any nilpotent ideal $\mathfrak{n}$ of R, the left annihilator $(\mathfrak{n})_l = \{x \in R \mid x\mathfrak{n} = 0\}$ is a right
large ideal of R.
For, given $c \neq 0$, there exists $s \geq 1$ such that $c\mathfrak{n}^{s-1} \neq 0$, $c\mathfrak{n}^s = 0$, and
$c\mathfrak{n}^{s-1} \subseteq cR \cap (\mathfrak{n})_l$. ∎

L.4 A module is semisimple if and only if it has no proper large submodules.

For in a semisimple module every proper submodule has a non-zero complement
and so cannot be large. Conversely, if M has no proper large submodules and
$M' \subseteq M$ is given, we can by Zorn’s lemma find a submodule N which is maximal
subject to $N \cap M' = 0$. If P is a non-zero submodule such that $P \cap (N + M') = 0$,
then the sum $P + N + M'$ is direct and so $(P + N) \cap M' = 0$, but this contradicts
the maximality of N. Hence $P \cap (N + M') \neq 0$ for all $P \neq 0$, so $N + M'$ is large
and therefore equal to M. Hence N is a complement of $M'$ in M and this shows
M to be semisimple. ∎
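Property L.4 can be observed on the cyclic groups Z/m, regarded as Z-modules. The sketch below is an illustration only; it uses the standard description of the subgroups of Z/m as dZ/mZ for the divisors d of m (an assumption not stated in the text), and the function names are hypothetical.

```python
from math import gcd


def divisors(m):
    return [d for d in range(1, m + 1) if m % d == 0]


def is_large(d, m):
    """Is the subgroup dZ/mZ large in Z/m?  (d must divide m.)

    The subgroups of Z/m are eZ/mZ for divisors e of m, and
    dZ/mZ meets eZ/mZ in lcm(d, e)Z/mZ, which is zero iff lcm(d, e) == m.
    """
    for e in divisors(m):
        if e == m:                    # eZ/mZ is the zero subgroup; skip it
            continue
        lcm = d * e // gcd(d, e)
        if lcm == m:                  # a non-zero subgroup meeting dZ/mZ trivially
            return False
    return True


def proper_large_subgroups(m):
    """Divisors d > 1 of m for which dZ/mZ is a proper large subgroup of Z/m."""
    return [d for d in divisors(m) if d != 1 and is_large(d, m)]


print(proper_large_subgroups(30))     # []: 30 is squarefree, Z/30 is semisimple
print(proper_large_subgroups(12))     # [2]: 2Z/12Z is proper and large
```

In these examples Z/m has a proper large subgroup exactly when m is not squarefree, in line with L.4.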


We recall that a module M is called uniform if $M \neq 0$ and every non-zero submodule
of M is large. For example, an integral domain R is right uniform (i.e. uniform
as right R-module) iff it is a right Ore domain. With the help of uniform
modules we can define a form of dependence relation which leads to a notion of
rank in general modules. Let M be an R-module (for any ring R) and denote by $\mathcal{U}$
the collection of all its uniform submodules. On $\mathcal{U}$ we introduce a dependence relation
as follows. If $N, P_1, \ldots, P_r \in \mathcal{U}$, then N is said to be dependent on $P_1, \ldots, P_r$
if $N \cap \sum P_i \neq 0$. Generally N is said to be dependent on a (possibly infinite)
family of uniform submodules if it is dependent on a finite subfamily. A set of uniform
modules is independent if no member is dependent on the rest. This dependence
relation satisfies the following conditions:


D.0 In any family $\mathcal{F}$ of uniform submodules of M, each member of $\mathcal{F}$ is dependent
on $\mathcal{F}$.

D.1′ (Transitivity) If N is dependent on the independent family $\mathcal{F}$ and each member of
$\mathcal{F}$ is dependent on $\mathcal{G}$, then N is dependent on $\mathcal{G}$.

D.2 (Exchange property) If N is dependent on $\mathcal{F} \cup \{M'\}$ but not on $\mathcal{F}$, then $M'$ is
dependent on $\mathcal{F} \cup \{N\}$.


We note that these conditions are like those listed in Section 11.1 of BA (except 
that D.1' is a weaker form of D.1 listed there). 

D.0 is clear; to prove D.1′ we may take the families to be finite, say N is dependent
on the independent family $\{P_1, \ldots, P_r\}$ and each $P_i$ is dependent on $\{Q_1, \ldots, Q_s\}$. By
hypothesis, $N \cap \sum P_i \neq 0$, so there exists $n \in N$, $n \neq 0$, such that

\[ n = p_1 + \cdots + p_r, \quad \text{where } p_i \in P_i. \tag{7.4.2} \]

Writing $Q = \sum Q_j$, we have to show that $N \cap Q \neq 0$. If $p_i \in Q$ for all i, the
conclusion follows from (7.4.2); otherwise we choose an equation (7.4.2) with $n \neq 0$
such that the least number of $p_i$ are not in Q. If $p_1 \notin Q$ say, then since $P_1$ is uniform,
$p_1R \cap (P_1 \cap Q) \neq 0$, say $0 \neq p_1c \in Q$. Now in

\[ nc = p_1c + p_2c + \cdots + p_rc \tag{7.4.3} \]

there are fewer terms outside Q than in (7.4.2) and $nc \neq 0$, because $p_1c \neq 0$ and the
sum $\sum P_i$ is direct. This contradiction shows that $N \cap Q \neq 0$, as claimed.

The exchange axiom D.2 follows easily: if N is dependent on $P_1, \ldots, P_r$ but not on
$P_2, \ldots, P_r$ then there exists $n \in N$, $n \neq 0$, such that (7.4.2) holds with $p_1 \neq 0$, by
hypothesis. If we now rewrite (7.4.2) as $p_1 = n - p_2 - \cdots - p_r$, we have a relation
showing $P_1$ to be dependent on $N, P_2, \ldots, P_r$, so D.2 holds. ∎


Let us also note that if N is dependent on a family $\mathcal{F}$ of uniform submodules, then
so is any non-zero submodule of N. For if $N \cap P \neq 0$, where P is a sum of terms in
$\mathcal{F}$, and $0 \neq N' \subseteq N$, then $N' \cap P = N' \cap (N \cap P) \neq 0$, because N is uniform.

A set of uniform submodules of M on which every uniform submodule depends
will be called a spanning set, and an independent spanning set is a basis; we recall
from BA, Section 11.1 that the bases are just the independent spanning sets and
every independent set is contained in a basis. We also have the exchange lemma
(BA, Lemma 11.1.2, which used D.1 only in the weaker form D.1′), which states
that for an independent set $\mathcal{F}$ and a spanning set $\mathcal{G}$, it is possible to complete $\mathcal{F}$
to a basis by adjoining a subset $\mathcal{G}'$ of $\mathcal{G}$, and if $\mathcal{G}$ is finite, then so is $\mathcal{F}$, and
$|\mathcal{F} \cup \mathcal{G}'| \leq |\mathcal{G}|$.

We remark that such bases need not exist (see Exercise 7), but if every non-zero
submodule of M contains a uniform submodule, then we can by Zorn’s lemma find a
direct sum of uniform submodules which is large in M, and hence a basis. We shall
mainly be interested in modules with a finite basis and we shall find conditions for
this to exist in Proposition 7.4.1 below.

From the exchange lemma we can deduce in the usual fashion that if M has a finite 
basis, then any two bases have the same number of terms (see BA, Corollary 11.1.6). 
This number is called the uniform rank or simply the rank of M, and is written rk M. 
It is clear that rk M = 1 iff M is uniform, and in a module of rank n, any direct sum 
of non-zero submodules has at most n terms; it has exactly n terms iff each term is 
uniform and the sum is large in M. The conditions for a module to have finite rank 
are easily stated: 




Proposition 7.4.1. Let R be any ring and M an R-module which contains no infinite
direct sums of non-zero submodules. Then there is a direct sum of uniform submodules
which is large in M, so that M has a rank, and in fact the rank is then finite. Conversely,
a module of finite rank contains uniform submodules, but no infinite direct sum of
submodules.

Proof. We begin by showing that every non-zero submodule N of M contains a
uniform submodule. For if N is not itself uniform, then it contains a direct sum
$N_1 \oplus N_1'$; now either $N_1$ is uniform or it contains a direct sum $N_2 \oplus N_2'$ and
continuing in this way, we obtain in M the direct sum

\[ N_1' \oplus N_2' \oplus \cdots. \]

Since M contains no infinite direct sums, this process must break off, which can
happen only when we reach an $N_i$ which is uniform. Hence N contains a uniform
submodule.

Now let $\sum_{i=1}^{r} U_i$ be a direct sum of uniform submodules in M; such a sum exists,
e.g. for $r = 0$. Either it is large in M or we can find $V \neq 0$ such that $V \cap \sum U_i = 0$.
By the first part of the proof V contains a uniform submodule $U_{r+1}$, and now
$\sum_{i=1}^{r+1} U_i$ is a direct sum of $r + 1$ terms. If we continue in this way, the process
must break off, because M has no infinite direct sums, and it can end only when
we have a direct sum of uniform submodules which is large in M. This shows that
M has a rank and that this rank is finite. Conversely, if rk M = n, then we know
that any direct sum contains part of a basis and so cannot have more than n
terms. ∎


This result may be applied to R itself, as left or right R-module, and in this way we 
obtain the notion of left rank and right rank of R. For example, an integral domain 
has right rank 1 iff it is a right Ore domain. In fact, by Proposition 7.1.9 we obtain 


Corollary 7.4.2. An integral domain which has finite right rank necessarily has right
rank 1. ∎

We remark that for a submodule $M'$ of M we have $\mathrm{rk}\, M' \leq \mathrm{rk}\, M$, with equality iff
$M'$ is large in M. On the other hand, going over to a quotient module may well raise
the rank, e.g. rk Z = 1, but rk(Z/m) > 1 unless m is a prime power.
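The remark about Z/m can be checked by brute force. The following sketch is not from the text: it tests uniformity of Z/m directly from the definition, using the description of the subgroups of Z/m as dZ/mZ for divisors d of m, and compares the result with the number of distinct primes dividing m. That this number equals rk(Z/m) rests on the primary decomposition of Z/m, a standard fact assumed here rather than proved. The function names are illustrative.

```python
from math import gcd


def proper_divisors(m):
    """Divisors d of m with d < m; each gives the non-zero subgroup dZ/mZ of Z/m."""
    return [d for d in range(1, m) if m % d == 0]


def is_uniform(m):
    """Z/m is uniform iff it is non-zero and any two non-zero subgroups meet non-trivially.

    dZ/mZ and eZ/mZ intersect in lcm(d, e)Z/mZ, which is zero iff lcm(d, e) == m.
    """
    subs = proper_divisors(m)
    if not subs:                       # m == 1: the zero module is not uniform
        return False
    return all(d * e // gcd(d, e) != m for d in subs for e in subs)


def rank_of_cyclic(m):
    """Number of distinct primes dividing m (the number of primary summands of Z/m)."""
    count, p = 0, 2
    while m > 1:
        if m % p == 0:
            count += 1
            while m % p == 0:
                m //= p
        p += 1
    return count


for m in (8, 12, 30):
    print(m, is_uniform(m), rank_of_cyclic(m))
```

For every m tried, `is_uniform(m)` holds exactly when `rank_of_cyclic(m) == 1`, that is, when m is a prime power greater than 1.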

Let R be any ring and M be a right R-module. For any subset S of M we define the
right annihilator of S in R as

\[ (S)_r = \{ x \in R \mid Sx = 0 \}. \]

It is clear that $(S)_r$ is a right ideal; if S is a submodule, $(S)_r$ is even a two-sided ideal.
When $S \neq 0$, $(S)_r$ cannot contain 1 and so will be proper. If $S = \{a\}$, we write $(a)_r$
instead of $(\{a\})_r$. In particular, this defines right annihilators of subsets of R; the left
annihilator of a subset S of R (or more generally, of a left R-module) is defined
similarly. The ring R is said to satisfy the maximum condition on right annihilators if every
collection of right annihilators in R (of subsets of R) has a maximal member, e.g. any
right Noetherian ring satisfies the maximum condition on right annihilators.




Proposition 7.4.3. Let R be a semiprime ring with maximum condition on right
annihilators. Then every nil left or right ideal is zero.

Proof. It is clearly enough to prove the result for principal ideals, and we need only
consider left ideals, for Ra is nil iff $(xa)^n = 0$ for all x and suitable $n = n(x)$. Now
$(xa)^n = 0$ implies $(ax)^{n+1} = a(xa)^n x = 0$, hence if Ra is nil, then so is aR.

Thus let Ra be a non-zero nil left ideal and choose a maximal annihilator not equal
to R of the form $(xa)_r$. Writing $b = xa$, we choose $y \in R$; if $yb \neq 0$, take $t \geq 2$ such
that $(yb)^{t-1} \neq 0$, $(yb)^t = 0$. Then $(b)_r \subseteq ((yb)^{t-1})_r$ and by maximality we have
equality here; since $yb \in ((yb)^{t-1})_r$, we conclude that $byb = 0$. This holds for all
$y \in R$, even when $yb = 0$. Hence $bRb = 0$, $b \neq 0$, in contradiction to the fact that
R is semiprime. Hence every nil left (or right) ideal is 0. ∎


We shall also need the notion of a singular submodule. For any ring R and any
right R-module M consider the set

\[ Z(M) = \{ m \in M \mid (m)_r \text{ is a large right ideal of } R \}. \]

This set is a submodule of M, called the singular submodule. To verify the module
property, let $u, v \in Z(M)$; then $\mathfrak{a} = (u)_r$ and $\mathfrak{b} = (v)_r$ are right large, hence so is
$\mathfrak{a} \cap \mathfrak{b}$ and $(u - v)(\mathfrak{a} \cap \mathfrak{b}) = 0$, therefore $(u - v)_r$ is right large. Further, if $a \in R$,
then $(ua)_r$ is right large, by L.2, for $x \in (ua)_r \Leftrightarrow ax \in (u)_r$ and the latter is right
large by hypothesis. Thus Z(M) is indeed a submodule. In particular, taking
M = R, we obtain the right singular ideal Z(R) of R. By what has been shown, it
is a right ideal; in fact it is two-sided, for if $(a)_r$ is right large, then so is $(ba)_r \supseteq (a)_r$.

Although Goldie’s theorem is concerned with Noetherian rings, it applies to a 
somewhat wider class, defined as follows. A ring which is of finite right rank and 
satisfies the maximum condition on right annihilators is called a right Goldie ring. 
In particular, every right Noetherian ring is right Goldie; of course the converse is 
false, as the example of commutative integral domains shows. In a right Goldie 
ring the right singular ideal is nilpotent; we shall only need the special case where 
the ring is semiprime, when it is a consequence of the next result (see Exercise 9 
of Section 8.5 for the general case).

Proposition 7.4.4. In a right Goldie ring R, for each $a \in R$ there exists $n \geq 0$ such that
$a^nR + (a^n)_r$ is right large. Moreover, $(a^\nu)_r = (a^n)_r$ for all $\nu \geq n$ and the sum
$a^nR + (a^n)_r$ is direct.

Proof. The sequence $(a)_r \subseteq (a^2)_r \subseteq \cdots$ becomes stationary, say $(a^\nu)_r = (a^n)_r$ for
$\nu \geq n$. It follows that $(a^n)_r \cap a^nR = 0$, for if $a^n \cdot a^n x = 0$, then $x \in (a^{2n})_r = (a^n)_r$,
hence $a^n x = 0$. If $\mathfrak{c}$ is any right ideal such that $\mathfrak{c} \cap (a^nR + (a^n)_r) = 0$, then the
sum $\mathfrak{c} + a^n\mathfrak{c} + a^{2n}\mathfrak{c} + \cdots$ is direct, for if $a^{rn}c_r + a^{(r+1)n}c_{r+1} + \cdots + a^{sn}c_s = 0$,
where $c_i \in \mathfrak{c}$, $c_r \neq 0$, then $c_r \in (a^n)_r + a^nR$, which is a contradiction. Since R has
finite rank, $a^{sn}\mathfrak{c} = 0$ for some $s \geq 1$, i.e. $\mathfrak{c} \subseteq (a^{sn})_r = (a^n)_r$ and it follows that
$\mathfrak{c} = \mathfrak{c} \cap (a^nR + (a^n)_r) = 0$. This proves that $a^nR + (a^n)_r$ is right large, and we have
seen that the sum is direct. ∎
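The statement of Proposition 7.4.4 can be observed in the finite commutative ring R = Z/m, which is Noetherian and hence Goldie. The sketch below is only an illustration with hypothetical helper names: it computes the annihilator chain of the powers of a, finds where it becomes stationary, and checks that $a^nR$ and $(a^n)_r$ meet in zero. In this small example the direct sum is in fact all of R.

```python
def annihilator(a, m):
    """(a)_r in R = Z/m: all x with a*x = 0."""
    return {x for x in range(m) if (a * x) % m == 0}


def principal_ideal(a, m):
    """The ideal aR in R = Z/m."""
    return {(a * x) % m for x in range(m)}


def stable_exponent(a, m):
    """Smallest n >= 0 with (a^n)_r == (a^(n+1))_r; the chain is constant from there on."""
    n, power = 0, 1 % m
    while annihilator(power, m) != annihilator((power * a) % m, m):
        n, power = n + 1, (power * a) % m
    return n


def decomposition(a, m):
    """Return n, the intersection a^n R ∩ (a^n)_r, and the sum a^n R + (a^n)_r."""
    n = stable_exponent(a, m)
    power = pow(a, n, m)
    image, kernel = principal_ideal(power, m), annihilator(power, m)
    total = {(u + v) % m for u in image for v in kernel}
    return n, image & kernel, total


n, meet, total = decomposition(2, 12)
print(n, meet, len(total))   # prints: 2 {0} 12
```

Here $(2)_r \subset (4)_r = (8)_r$ in Z/12, so n = 2, and $4R = \{0, 4, 8\}$ meets $(4)_r = \{0, 3, 6, 9\}$ only in 0.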


We note two extreme special cases of this result. 




Corollary 7.4.5. In a right Goldie ring R, if a is left regular, then aR is right large.

Proof. In this case $(a^n)_r = 0$ for all n and if $a^nR$ is right large, then so is aR. ∎

Corollary 7.4.6. In a semiprime right Goldie ring R, Z(R) = 0.

Proof. By Proposition 7.4.4, if $(a)_r$ is right large, then a is nilpotent. Hence Z(R) is a
nil ideal and so must be zero, by Proposition 7.4.3. ∎


The next lemma, essentially the converse of Corollary 7.4.5, will be needed for 
Goldie’s theorem, but is also useful elsewhere. 


Lemma 7.4.7. In a semiprime right Goldie ring any large right ideal contains a regular 
element. 


Proof. By Proposition 7.4.3, any nil right ideal of R is 0, so a non-zero right ideal $\mathfrak{a}$
will contain a non-nilpotent element a. By Proposition 7.4.4 we can find a power $a_1$
of a such that $(a_1)_r = (a_1^2)_r$ and so $a_1R + (a_1)_r$ is direct. If $(a_1)_r \cap \mathfrak{a} \neq 0$, we choose
$a_2 \in (a_1)_r \cap \mathfrak{a}$ such that $a_2 \neq 0$ and $(a_2)_r = (a_2^2)_r$. Then $a_2R + ((a_1)_r \cap (a_2)_r \cap \mathfrak{a})$ is a
direct sum contained in $(a_1)_r \cap \mathfrak{a}$, hence the sum $a_1R + a_2R + ((a_1)_r \cap (a_2)_r \cap \mathfrak{a})$ is
direct. If $(a_1)_r \cap (a_2)_r \cap \mathfrak{a} \neq 0$, we can continue the process; at the n-th stage we have
a direct sum

\[ a_1R + \cdots + a_nR + ((a_1)_r \cap \cdots \cap (a_n)_r \cap \mathfrak{a}), \tag{7.4.4} \]

where $a_i \in (a_1)_r \cap \cdots \cap (a_{i-1})_r \cap \mathfrak{a}$ and $(a_i)_r = (a_i^2)_r \neq R$.
Since R has finite rank, the process must stop; if this happens at the n-th stage, we
have

\[ (a_1)_r \cap \cdots \cap (a_n)_r \cap \mathfrak{a} = 0. \tag{7.4.5} \]

So far $\mathfrak{a}$ was any right ideal $\neq 0$; if we take $\mathfrak{a}$ to be right large, then by (7.4.5) we find

\[ (a_1)_r \cap \cdots \cap (a_n)_r = 0. \tag{7.4.6} \]

Put $c = \sum a_i$; by construction $c \in \mathfrak{a}$ and we claim that c is regular. If $cx =
\sum a_i x = 0$, then by the directness of (7.4.4), $a_i x = 0$ for all i, hence $x = 0$ by (7.4.6),
i.e. $(c)_r = 0$. By Corollary 7.4.5, cR itself is right large, hence $(c)_l \subseteq Z(R)$, but
Z(R) = 0, by Corollary 7.4.6, so $(c)_l = 0$ and c is regular. ∎


In the presence of maximum conditions the definition of a right Ore set can be 
simplified a little; this is sometimes useful, although it is not actually needed here. 


Proposition 7.4.8. Let R be a ring with maximum condition on right annihilators and 
let S be a multiplicative subset of R such that 


(i) for any $a \in R$, $s \in S$, $aS \cap sR \neq \emptyset$,
(ii) for any $a \in R$, $s \in S$, $as = 0 \Rightarrow a = 0$.


Then S is a right Ore set consisting of regular elements. 




Proof. We need only show that for any $a \in R$, $s \in S$, $sa = 0 \Rightarrow a = 0$. By the
maximum condition the sequence

\[ (s)_r \subseteq (s^2)_r \subseteq \cdots \]

becomes stationary, say $(s^n)_r = (s^{n+1})_r$. If $sa = 0$, then by (i) there exist $s'$, $a'$ such
that $as' = s^n a'$; hence $s^{n+1}a' = sas' = 0$, so $a' \in (s^{n+1})_r = (s^n)_r$, and so
$0 = s^n a' = as'$, hence $a = 0$ by (ii). ∎


We now come to the main result of this section. 


Theorem 7.4.9 (Goldie’s theorem). A ring R has a total quotient ring Q which is
semisimple if and only if R is a semiprime right Goldie ring. Moreover, Q is simple
Artinian if and only if R is prime right Goldie.


Proof. Assume that R is semiprime right Goldie and let S denote the set of all regular
elements of R. We shall show that S is a right Ore set. Given $a \in R$, $s \in S$, define

\[ \mathfrak{c} = \{ x \in R \mid ax \in sR \}. \]

By Corollary 7.4.5, sR is right large, hence by L.2, so is $\mathfrak{c}$, therefore it contains a
regular element (Lemma 7.4.7). This shows that $sR \cap aS \neq \emptyset$, so S is a right Ore set.
Let $Q = R_S$ be the total quotient ring. If $\mathfrak{A}$ is a large right ideal of Q, then $\mathfrak{A} \cap R$ is
right large in R and so contains a regular element. This must be a unit in Q, hence
$\mathfrak{A} = Q$, i.e. Q has no proper large right ideals, and so is semisimple, by L.4.
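Conversely, let R be a ring with a semisimple quotient ring Q. We shall show that
for any right ideal $\mathfrak{c}$ of R the following conditions are equivalent:

(a) $\mathfrak{c}$ is right large in R,
(b) $\mathfrak{c}Q = Q$,
(c) $\mathfrak{c}$ contains a regular element of R.

(a) $\Leftrightarrow$ (b). Assume that $\mathfrak{c}$ is right large and let $\mathfrak{A}$ be any non-zero right ideal of Q.
Then $\mathfrak{A} \cap R \neq 0$, hence $\mathfrak{A} \cap R \cap \mathfrak{c} \neq 0$, and so $\mathfrak{A} \cap \mathfrak{c}Q \neq 0$. This shows that $\mathfrak{c}Q$ is
right large in Q, hence $\mathfrak{c}Q = Q$ by L.4. Conversely, if $\mathfrak{c}Q = Q$ and $\mathfrak{a}$ is a non-zero
right ideal in R, then $\mathfrak{a}Q \cap \mathfrak{c}Q \neq 0$, hence $\mathfrak{a} \cap \mathfrak{c} \neq 0$, so $\mathfrak{c}$ is right large in R.

(b) $\Leftrightarrow$ (c). If $\mathfrak{c}Q = Q$, then $1 = as^{-1}$, where $a \in \mathfrak{c}$ and s is a regular element in R,
hence $a = s$ is regular. Conversely, if $\mathfrak{c}$ contains a regular element, then clearly
$\mathfrak{c}Q = Q$.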

We can now complete the proof of the theorem by verifying that R is semiprime 
right Goldie. 

Let $\mathfrak{n}$ be a nilpotent ideal of R. Then $(\mathfrak{n})_l$ is a right large ideal in R, by L.3, so it
contains a regular element and it follows that $\mathfrak{n} = 0$. Thus R is semiprime.

Next let $\mathfrak{a} = \sum \mathfrak{a}_i$ be a direct sum of non-zero right ideals in R which is right
large; then $\mathfrak{a}$ contains a regular element c, say:

\[ c = x_1 + \cdots + x_r, \quad \text{where } x_\nu \in \mathfrak{a}_{i_\nu}. \]

Now cR is right large and is contained in $\mathfrak{a}_{i_1} + \cdots + \mathfrak{a}_{i_r}$, hence the latter sum is right
large, so the sum $\sum \mathfrak{a}_i$ was finite.




For any subset X of R, its right annihilator in R is obtained by intersecting its
annihilator in Q with R. In an obvious notation we have

\[ (X)_r^R = (X)_r^Q \cap R. \]

Now the maximum condition for right annihilators follows for R, because it holds
in Q. Finally, if R is prime, then so is Q, by Proposition 7.1.10. It follows that Q
is simple. Conversely, if Q is simple, then R must be prime, for if $\mathfrak{a}$, $\mathfrak{b}$ are ideals
in R such that $\mathfrak{a}\mathfrak{b} = 0$, then $\mathfrak{b}Q\mathfrak{a} \cap R$ is an ideal of R whose square is zero; since R
is semiprime, we have $\mathfrak{b}Q\mathfrak{a} \cap R = 0$ and hence $\mathfrak{b}Q\mathfrak{a} = 0$. But Q is simple, so it
follows that $\mathfrak{a} = 0$ or $\mathfrak{b} = 0$, and this shows R to be prime. ∎


For prime rings Theorem 7.4.9 was proved by Goldie [1958] and extended by him
to semiprime rings in 1960. The result raises the question of localizing relative to a
prime ideal. If R is a Noetherian ring with a prime ideal $\mathfrak{p}$, and $C_{\mathfrak{p}}$ denotes the set of
all elements of R that are regular mod $\mathfrak{p}$, i.e. whose image in $R/\mathfrak{p}$ is regular, then it is
not necessarily the case that $C_{\mathfrak{p}}$ is a right Ore set. Here it is necessary to consider
more than one prime (a so-called ‘clique’ of prime ideals), and to localize at the
set of elements that are regular modulo all the prime ideals considered. For a detailed
account of this method see Jategaonkar (1986), McConnell and Robson (1987) or
Goodearl and Warfield (1989).


Exercises 


1. (Kasch–Sandomierski) Show that the socle of a module is the intersection of
all its large submodules. (Hint. Show first that every submodule is a direct
summand of a large submodule).

2. Let A be the direct sum of an infinite and a finite (non-zero) cyclic group. Show
that for the dependence relation defined for A as Z-module the form of D.1′
without the independence of $\mathcal{F}$ (D.1 of BA) does not hold. Show also that
this form of D.1′ does hold on any torsion-free module.

3. Show that an Artinian semiprime ring is self-injective.

4. Find the rank of Z/m, for a positive integer m.

5. Show that for any ring R and any $n \geq 1$, $\mathrm{rk}(R_n) = n \cdot \mathrm{rk}\, R$.

6. Show that the injective hull of a uniform module is indecomposable. Use this
fact and the Krull-Schmidt theorem to give another proof of Proposition 7.4.1.

7. Let F be the ring of all real continuous functions on the unit interval with pointwise
operations: $(f + g)(x) = f(x) + g(x)$, $(fg)(x) = f(x)g(x)$. Show that F has
no uniform ideals.

8. Show that the maximum condition on left annihilators is equivalent to the
minimum condition on right annihilators.

9. Show that a reduced ring ($x^2 = 0 \Rightarrow x = 0$) is non-singular.

10. Show that over an Ore domain every finitely generated flat module is projective.
(Hint. Use Further Exercise 16 of Chapter 4.)

11. Show that every left and right self-injective ring is its own quotient ring (R is
right self-injective if $R_R$ is injective).

12. Show that a non-zero submodule of a direct sum of uniform modules has a uniform
submodule. (Hint. First find a non-zero submodule in a finite direct sum.)


7.5 PI-algebras


Let k be a field and $F = k\langle x_1, \ldots, x_d \rangle$ the free k-algebra on $x_1, \ldots, x_d$. The elements
of F are called polynomials and a k-algebra A is said to satisfy the polynomial identity

\[ p(x_1, \ldots, x_d) = 0, \tag{7.5.1} \]

if p is an element of F which vanishes for all values of the x’s in A. If A satisfies a
non-trivial polynomial identity, i.e. an identity (7.5.1) where p is not the zero
polynomial, then A is called a PI-algebra. Many of the results proved for Noetherian
rings have their counterparts for PI-algebras. It will simplify matters to assume a
field of coefficients, though it is possible to consider more general coefficient rings
(see Procesi (1973) or Rowen (1980)).


Examples 


1. Every commutative k-algebra satisfies the identity $xy - yx = 0$ and so is a
PI-algebra.

2. Every Boolean ring satisfies the identity $x^2 - x = 0$.

3. Every finite-dimensional algebra satisfies an identity. If dim A < n, then A
satisfies the identity

\[ S_n(x_1, \ldots, x_n) \equiv \sum_{\sigma} \operatorname{sgn}(\sigma)\, x_{\sigma 1} x_{\sigma 2} \cdots x_{\sigma n} = 0, \tag{7.5.2} \]

where $\sigma$ runs over all permutations of $1, \ldots, n$ and $\operatorname{sgn}(\sigma)$ is 1 or −1 according as
$\sigma$ is even or odd. $S_n$ is called the standard polynomial and (7.5.2) the standard
identity of degree n. For any $a_1, \ldots, a_n \in A$ it is clear that $S_n(a_1, \ldots, a_n) = 0$ if
two of the a’s coincide. Now take a k-basis $e_1, \ldots, e_r$ ($r = \dim A < n$) of A;
then for any $a_1, \ldots, a_n \in A$, $S_n(a_1, \ldots, a_n)$ can by linearity be written as a
linear combination of terms $S_n(e_{i_1}, \ldots, e_{i_n})$. Since r < n, at least two of the e’s
must coincide, so all terms vanish. This shows that A satisfies (7.5.2).

4. If A is a commutative k-algebra, then $\mathfrak{M}_n(A)$ satisfies the standard identity of
degree $n^2 + 1$, for we have a basis of $n^2$ elements, so the result follows as in
Example 3. In Theorem 7.5.8 below we shall see that $\mathfrak{M}_n(A)$ actually satisfies
$S_{2n} = 0$.

5. If every element of A is algebraic over k, of degree at most n, each element of A
satisfies an equation

\[ x^n + a_1 x^{n-1} + \cdots + a_n = 0, \quad \text{where } a_i \in k. \tag{7.5.3} \]

If the equation for some element of A has degree less than n, we can bring it to the
form (7.5.3) by multiplying by a power of x. Writing $[x, y] = xy - yx$, we obtain
from (7.5.3),

\[ [x^n, y] + a_1[x^{n-1}, y] + \cdots + a_{n-1}[x, y] = 0. \]

Thus the commutators in this expression are linearly dependent and so A satisfies
the identity

\[ S_n([x, y], [x^2, y], \ldots, [x^n, y]) = 0. \]
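Examples 3 and 4 can be tested on small integer matrices. The sketch below is an illustration, not part of the text: it evaluates the standard polynomial $S_n$ by summing signed products over all permutations, with matrices represented as nested lists, and the function names are hypothetical.

```python
from itertools import permutations


def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def sign(perm):
    """Sign of a permutation of 0..n-1, computed from its number of inversions."""
    inversions = sum(1 for i in range(len(perm)) for j in range(i + 1, len(perm))
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def standard_polynomial(mats):
    """S_n(mats[0], ..., mats[n-1]) = sum over permutations of sgn * product."""
    size = len(mats[0])
    total = [[0] * size for _ in range(size)]
    for perm in permutations(range(len(mats))):
        prod = [[int(i == j) for j in range(size)] for i in range(size)]  # identity
        for idx in perm:
            prod = mat_mul(prod, mats[idx])
        s = sign(perm)
        total = [[total[i][j] + s * prod[i][j] for j in range(size)] for i in range(size)]
    return total


e12 = [[0, 1], [0, 0]]
e21 = [[0, 0], [1, 0]]
print(standard_polynomial([e12, e21]))      # [[1, 0], [0, -1]]: S_2 is not an identity

four = [[[1, 2], [3, 4]], [[0, 1], [1, 0]], [[2, 0], [1, 1]], [[1, 1], [0, 3]]]
print(standard_polynomial(four))            # [[0, 0], [0, 0]]: S_4 vanishes on 2 x 2 matrices
```

The second result is an instance of the sharper statement $S_{2n} = 0$ for n = 2; adding any fifth matrix gives zero as well, in line with the dimension count of Example 4.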


We collect some elementary facts about PI-algebras.

Proposition 7.5.1. Any subalgebra or homomorphic image of a PI-algebra is again a
PI-algebra.

Proof. The proof is immediate. ∎

Since the free algebra is clearly not a PI-algebra, we deduce from Proposition 7.1.8,

Corollary 7.5.2. Every PI-algebra which is also an integral domain is a left and right
Ore domain. ∎


A polynomial p and the corresponding identity p = 0 is said to be multilinear if
it is homogeneous of degree 1 in each variable. In a PI-algebra we can always find
multilinear identities:

Proposition 7.5.3. Any algebra A satisfying a polynomial identity of degree n also
satisfies a polynomial identity of degree n which is multilinear.

Proof. Let $p = p(x_1, \ldots, x_d)$ be a polynomial of degree n which vanishes identically
on A, and let r be the highest degree to which any variable occurs, say p has degree r
in $x_1$. If r > 1, replace p by $p(x_1 + x_{d+1}, x_2, \ldots) - p(x_1, x_2, \ldots) - p(x_{d+1}, x_2, \ldots)$;
since $(x_1 + x_{d+1})^r - x_1^r - x_{d+1}^r \neq 0$ (even in finite characteristic, because
$x_1x_{d+1} \neq x_{d+1}x_1$), we get a polynomial in which $x_1$, $x_{d+1}$ occur, but of degree less than r and
the degree of the other variables is not raised. By a double induction, on the highest
degree r and on the number of variables occurring to this degree, we reduce p to
a polynomial of degree 1 in each variable. The process preserves the total degree,
so we get a polynomial q with a term $a x_1 x_2 \cdots x_n$, say. Now replace q by
$q(\ldots, x_i, \ldots) - q(\ldots, 0, \ldots)$ ($1 \leq i \leq n$) to get rid of terms not involving $x_i$. ∎


To illustrate the proposition, suppose we have an algebra A satisfying the identity

\[ x^2 = 0. \tag{7.5.4} \]

Here we do not restrict our algebra to be unital (for otherwise it would have to be
trivial, by (7.5.4)). Then A also satisfies $(x + y)^2 - x^2 - y^2 = 0$, i.e.

\[ xy + yx = 0, \tag{7.5.5} \]

and this is multilinear. In characteristic other than 2 the identities (7.5.4) and (7.5.5)
are equivalent, for we can get back from (7.5.5) to (7.5.4) by putting y = x, which
gives $2x^2 = 0$.
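The linearization step used in the proof of Proposition 7.5.3 can be carried out mechanically. The following sketch is an illustration with hypothetical names: a non-commutative polynomial is stored as a dictionary from words (tuples of letters) to integer coefficients, and `linearize_step` forms $p(x + y) - p(x) - p(y)$ in one chosen variable.

```python
from collections import defaultdict
from itertools import product


def substitute(poly, var, replacement):
    """Substitute the polynomial `replacement` for the letter `var` in `poly`.

    A polynomial is a dict {word: coefficient}, a word being a tuple of letters.
    """
    result = defaultdict(int)
    for word, coeff in poly.items():
        # For each letter, list the (word, coefficient) pieces it may expand into.
        choices = [list(replacement.items()) if letter == var else [((letter,), 1)]
                   for letter in word]
        for combo in product(*choices):
            new_word = tuple(l for piece, _ in combo for l in piece)
            new_coeff = coeff
            for _, k in combo:
                new_coeff *= k
            result[new_word] += new_coeff
    return {w: c for w, c in result.items() if c != 0}


def subtract(p, q):
    """The difference p - q, with zero terms dropped."""
    result = defaultdict(int, p)
    for w, c in q.items():
        result[w] -= c
    return {w: c for w, c in result.items() if c != 0}


def linearize_step(poly, var, new_var):
    """p(var + new_var) - p(var) - p(new_var), all other letters unchanged."""
    both = substitute(poly, var, {(var,): 1, (new_var,): 1})
    only_new = substitute(poly, var, {(new_var,): 1})
    return subtract(subtract(both, poly), only_new)


square = {("x", "x"): 1}                    # the polynomial x^2 of (7.5.4)
print(linearize_step(square, "x", "y"))     # the words xy and yx, each with coefficient 1
```

Applied to $x^3$, the same step gives the six mixed words of degree 3, each of degree at most 2 in x, so one further step in each variable reaches a multilinear polynomial.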

A multilinear identity has the advantage that to verify it we need only check the 
elements of a basis. Moreover, such identities are preserved under extensions: 


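Corollary 7.5.4. Let R be a k-algebra with centre C. If R contains a subalgebra A which
is a PI-algebra such that R = AC, then R is a PI-algebra.

Proof. By Proposition 7.5.3, A satisfies a multilinear identity $p(x_1, \ldots, x_d) = 0$. Let
$\{u_\lambda\}$ be a k-basis for A; since R = AC, any elements $a_1, \ldots, a_d$ of R can be written as
$a_i = \sum \alpha_{i\lambda} u_\lambda$, where $\alpha_{i\lambda} \in C$. Then

\[ p(a_1, \ldots, a_d) = \sum \alpha_{1\lambda_1} \cdots \alpha_{d\lambda_d}\, p(u_{\lambda_1}, \ldots, u_{\lambda_d}) = 0, \]

by multilinearity, thus p vanishes on R. ∎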


In special cases we can make the assertion of this corollary more precise. 


Proposition 7.5.5. Let A be a finite-dimensional algebra over an infinite field k. Then
any polynomial identity $p(x_1, \dots, x_n) = 0$ for A also holds for $A_E$, for any extension
field E of k.

Proof. Let $e_1, \dots, e_d$ be a k-basis of A, take a field F obtained by adjoining dn
independent indeterminates $t_{ij}$ $(i = 1, \dots, d,\ j = 1, \dots, n)$ to k and put
$x_j = \sum_i t_{ij} e_i$. Then in $A_F$ we have

$$p(x_1, \dots, x_n) = \sum_\nu f_\nu e_\nu, \tag{7.5.6}$$

where the $f_\nu$ are polynomials in the $t_{ij}$ with coefficients in k. By hypothesis, $f_\nu$
vanishes identically on k: if $a_j = \sum_i \alpha_{ij} e_i$ $(\alpha_{ij} \in k)$, then
$\sum_\nu f_\nu(\alpha_{ij}) e_\nu = p(a_1, \dots, a_n) = 0$; hence $f_\nu(\alpha_{ij}) = 0$ for all
$\alpha_{ij}$ in k. Since k is infinite, $f_\nu$ vanishes identically, and by (7.5.6), p = 0 in
$A_E$ for any extension field E of k. ∎


In the opposite direction from Proposition 7.5.3 one can show that every PI-algebra
satisfies an identity in two variables. For the proof we shall need the fact
that every free algebra of rank at least 2 contains a free subalgebra of countable rank.
This is easily verified: in $k\langle x, y\rangle$ the elements $z_n = xy^n$ $(n = 1, 2, \dots)$ are free,
because in any equation $f(z_1, \dots, z_r) = 0$ we can equate homogeneous components,
and if $\sum u_i z_i = 0$, then $\sum u_i x y^i = 0$, and it follows that $u_i = 0$, so we reach the
conclusion by induction on the degree.


Proposition 7.5.6. Every PI-algebra satisfies an identity in two variables.

Proof. If $p(x_1, \dots, x_d) = 0$ is a non-trivial identity in a k-algebra A, then so is
$p(xy, xy^2, \dots, xy^d) = 0$, and as we have just seen, this is non-trivial. ∎
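
The freeness of the elements $xy^n$ comes down to the fact that a word in x and y can be read
back uniquely as a product of blocks $xy^i$. The following Python sketch (hypothetical names,
words encoded as tuples of indices, an assumption of this illustration) performs the
substitution $x_i \mapsto xy^i$ on monomials and checks by brute force that distinct monomials
of small degree have distinct images.

```python
from itertools import product

def to_two_variables(word):
    """Map a monomial x_{i_1} ... x_{i_r}, given as the tuple (i_1, ..., i_r),
    to a string in x and y by substituting x_i -> x y^i."""
    return "".join("x" + "y" * i for i in word)

def is_injective(num_vars, max_length):
    """Check that distinct monomials in x_1, ..., x_{num_vars} of length
    at most max_length have distinct images under the substitution."""
    seen = {}
    for length in range(max_length + 1):
        for word in product(range(1, num_vars + 1), repeat=length):
            image = to_two_variables(word)
            if image in seen and seen[image] != word:
                return False
            seen[image] = word
    return True

print(to_two_variables((1, 2)))   # the monomial x_1 x_2 becomes x y x y y
print(is_injective(3, 4))
```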


One of the main results in PI-theory, Kaplansky's theorem, asserts that a primitive
PI-algebra is finite-dimensional; this will be proved in Chapter 8, where primitive
rings are discussed. For the moment we shall show that the n × n matrix ring over a
commutative ring satisfies the standard identity $S_{2n}$ (Amitsur–Levitzki theorem). The
proof uses exterior algebras; we recall that the exterior algebra on a vector space V is
the algebra generated by V with the defining relations $v^2 = 0$ $(v \in V)$. If V has the
basis $v_1, \dots, v_m$ then a basis for the algebra is given by the elements $v_{i_1} \cdots v_{i_r}$
$(i_1 < \dots < i_r,\ 1 \le r \le m)$ (see BA, Section 6.4). We shall also need an elementary
result on traces.


Lemma 7.5.7. Let K be a commutative Q-algebra and $A \in \mathfrak{M}_n(K)$. If $\mathrm{tr}(A^r) = 0$ for
r = 1, 2, ..., n, then $A^n = 0$.


Proof. Suppose first that K is an algebraically closed field (necessarily of characteristic 0)
and let A be an n × n matrix over K, with eigenvalues $\lambda_1, \dots, \lambda_n$. The
characteristic polynomial of A is

$$\det(xI - A) = x^n + c_1 x^{n-1} + \dots + c_n, \tag{7.5.7}$$

where the $c_i$ are (except for sign) the elementary symmetric functions of the λ's.
Since we are in characteristic 0, we can express $c_1, \dots, c_n$ as polynomials in the
power sums of the λ's, $s_r = \sum_i \lambda_i^r = \mathrm{tr}(A^r)$, r = 1, ..., n (Newton's formulae):

$$c_i = f_i(s_1, \dots, s_n), \tag{7.5.8}$$

where $s_r = \mathrm{tr}(A^r)$ and $f_i$ is of weight i and with rational coefficients.

Now let K be as stated in the lemma, and A the given matrix. Its characteristic
polynomial is given by (7.5.7), where the $c_i$ are still given by (7.5.8) in terms of
the $s_r$ (since these equations are identities in the entries of A). By hypothesis,
$\mathrm{tr}(A^r) = 0$ for $1 \le r \le n$, hence (7.5.7) reduces to $x^n$, and the Cayley–Hamilton
theorem (which clearly holds for any commutative ring) shows that $A^n = 0$. ∎
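
The passage from power sums to the coefficients in (7.5.7) can be made concrete. The sketch
below is an illustration under stated assumptions, not part of the text: matrices are lists of
rows, the helper names are hypothetical, and the recursion used is the standard form of
Newton's formulae, $i e_i = \sum_{r=1}^{i} (-1)^{r-1} e_{i-r} s_r$, with $c_i = (-1)^i e_i$.
The division by i is where characteristic 0 enters, so exact rational arithmetic is used.

```python
from fractions import Fraction

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def char_poly_from_traces(a):
    """Return [c_1, ..., c_n] with det(xI - A) = x^n + c_1 x^(n-1) + ... + c_n,
    computed only from the power sums s_r = tr(A^r), r = 1, ..., n."""
    n = len(a)
    power = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    s = [None]                      # s[r] = tr(A^r); index 0 is unused
    for _ in range(n):
        power = mat_mul(power, a)
        s.append(Fraction(trace(power)))
    e = [Fraction(1)]               # e[i] = i-th elementary symmetric function
    for i in range(1, n + 1):
        total = sum((-1) ** (r - 1) * e[i - r] * s[r] for r in range(1, i + 1))
        e.append(total / i)         # division by i needs characteristic 0
    return [(-1) ** i * e[i] for i in range(1, n + 1)]

print(char_poly_from_traces([[2, 1], [1, 3]]))
print(char_poly_from_traces([[0, 1, 0], [0, 0, 1], [0, 0, 0]]))
```

For the second matrix all traces of powers vanish, so every $c_i$ is 0 and the characteristic
polynomial is $x^3$, as in the lemma.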


Theorem 7.5.8 (Amitsur–Levitzki, 1950). Let K be any commutative ring and
$A_1, \dots, A_{2n} \in K_n$. Then

$$S_{2n}(A_1, \dots, A_{2n}) = 0. \tag{7.5.9}$$

Thus $\mathfrak{M}_n(K)$ satisfies a polynomial identity of degree 2n.


Proof. (S. Rosset) Suppose first that K is a Q-algebra and let E be the exterior algebra
on the free K-module of rank 2n, with basis $u_1, \dots, u_{2n}$, say. In $\mathfrak{M}_n(E)$ consider the matrix

$$A = \sum_{i=1}^{2n} A_i u_i.$$

For any r = 1, 2, ... we have

$$A^r = \sum_{i_1 < \dots < i_r} S_r(A_{i_1}, \dots, A_{i_r})\, u_{i_1} \wedge \dots \wedge u_{i_r}. \tag{7.5.10}$$

In particular, $A^{2n} = S_{2n}(A_1, \dots, A_{2n})\, u_1 \wedge \dots \wedge u_{2n}$, so we have to prove that
$A^{2n} = 0$. Let $E_0$ be the subalgebra of E of terms of even degree; then $E_0$ is commutative
and by (7.5.10), $A^2 \in \mathfrak{M}_n(E_0)$, so we need only show that $\mathrm{tr}(A^{2r}) = 0$, for
r = 1, ..., n, by Lemma 7.5.7. By (7.5.10) this will follow if we show

$$\mathrm{tr}(S_m(A_1, \dots, A_m)) = 0, \quad \text{where m is even.} \tag{7.5.11}$$


Let S be the symmetric group on 1, ..., m and T the stabilizer of 1 in S; then a
transversal of T in S is $1, \tau, \dots, \tau^{m-1}$, where $\tau = (1, 2, \dots, m)$. Every element of S is
uniquely expressible as $\tau^i \sigma$, where $0 \le i < m$ and $\sigma \in T$; for $\tau^i$ brings 1 to the
right place and σ then permutes 2, ..., m as needed. Hence we can write

$$S_m(A_1, \dots, A_m) = \sum_{i, \sigma} \mathrm{sgn}(\tau^i \sigma)\, A_{1\tau^i\sigma} A_{2\tau^i\sigma} \cdots A_{m\tau^i\sigma}.$$

Now τ has the effect of permuting the factors cyclically, which leaves the trace
unaffected, hence

$$\mathrm{tr}(S_m(A_1, \dots, A_m)) = \sum_i \mathrm{sgn}(\tau^i) \sum_\sigma \mathrm{sgn}(\sigma)\, \mathrm{tr}(A_{1\sigma} A_{2\sigma} \cdots A_{m\sigma}).$$

Here the second sum is independent of i, and $\sum_i \mathrm{sgn}(\tau^i) = 0$, because m is even, so
(7.5.11) follows, and this proves (7.5.9) when K is a Q-algebra. Therefore it holds for
a polynomial ring over Z (which can be embedded in a Q-algebra), and hence for any
commutative ring (which is a homomorphic image of a polynomial ring). ∎
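
For small n the identity (7.5.9) can be confirmed by direct computation. The following Python
sketch is only a numerical illustration with hypothetical helper names: it evaluates the
standard polynomial $S_m$ on integer matrices, given as lists of rows, by summing over all
permutations.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def standard_polynomial(matrices):
    """S_m(A_1, ..., A_m): the signed sum of the products of the A_i in all orders."""
    n = len(matrices[0])
    total = [[0] * n for _ in range(n)]
    for perm in permutations(range(len(matrices))):
        product = [[int(i == j) for j in range(n)] for i in range(n)]
        for index in perm:
            product = mat_mul(product, matrices[index])
        s = sign(perm)
        for i in range(n):
            for j in range(n):
                total[i][j] += s * product[i][j]
    return total

samples = [[[1, 2], [3, 4]], [[0, 1], [5, -2]], [[7, 0], [1, 1]], [[2, -3], [4, 6]]]
print(standard_polynomial(samples))       # S_4 on 2 x 2 matrices: the zero matrix
print(standard_polynomial(samples[:2]))   # S_2 is the commutator, here non-zero
```

The cost grows like m!, so this is practical only for very small sizes; it is a check of
instances, not a substitute for the proof.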


We note that the bound 2n in this result is best possible:

Lemma 7.5.9 (Staircase lemma). Let A be a k-algebra with $1 \neq 0$. Then $\mathfrak{M}_n(A)$
satisfies no polynomial identity of degree less than 2n.


Proof. If $A_n = \mathfrak{M}_n(A)$ satisfies an identity of degree r < 2n, then it also satisfies a
multilinear identity of degree r. Let this be p = 0, where each term of p consists of
$x_1, \dots, x_r$ in some order. Thus p has the form

$$p = \alpha x_1 x_2 \cdots x_r + p', \tag{7.5.12}$$

where $\alpha \in k$, $\alpha \neq 0$, and $p'$ is the sum of products of the x's in other orders than
that shown. Now the matrix units in $A_n$ satisfy

$$e_{11} e_{12} e_{22} e_{23} \cdots e_{n-1,n} e_{nn} = e_{1n},$$

while the product in any other order is 0, and this applies even if we only take the
first r, where $r \ge 1$. Hence if we put $x_1 = e_{11}$, $x_2 = e_{12}$, $x_3 = e_{22}, \dots$ then the first
term in (7.5.12) is $\alpha e_{1i}$ for some i, while all other terms vanish, so p does not
vanish on $A_n$, and we have reached a contradiction. ∎
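
The key step of the proof, that the staircase of matrix units has a non-zero product in exactly
one order, is easy to confirm for small n. In the sketch below (an illustration with
hypothetical names) a matrix unit $e_{ij}$ is encoded as the pair (i, j), and a product of
units is non-zero precisely when consecutive indices match.

```python
from itertools import permutations

def unit_product(units):
    """Multiply matrix units e_ij given as pairs (i, j); return the resulting
    unit as a pair, or None if the product is zero."""
    row, col = units[0]
    for i, j in units[1:]:
        if col != i:        # e_ab e_cd = 0 unless b = c
            return None
        col = j
    return (row, col)

def staircase(n):
    """The 2n - 1 units e_11, e_12, e_22, e_23, ..., e_nn."""
    units = []
    for i in range(1, n + 1):
        units.append((i, i))
        if i < n:
            units.append((i, i + 1))
    return units

def nonzero_orders(units):
    """All orderings of the given units whose product is non-zero."""
    return [p for p in permutations(units) if unit_product(p) is not None]

steps = staircase(3)
print(unit_product(steps))          # the unit e_13
print(len(nonzero_orders(steps)))   # only the staircase order survives
```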

Exercises


1. Show that a polynomial which vanishes identically on a non-trivial algebra must
have zero constant term.

2. Let R be a prime PI-algebra, satisfying an identity of degree d. Show that the left
(or right) uniform rank of R is $\le d$.

3. Show that if a Q-algebra with 1 satisfies the standard identity $S_{2n+1} = 0$, then it
also satisfies $S_{2n} = 0$.

4. Show that $S_n([x, y], [x^2, y], \dots, [x^n, y]) = 0$ holds in $\mathfrak{M}_n(k)$ but not in
$\mathfrak{M}_{n+1}(k)$, if k is infinite.

5. Show that $S_{n+1}(x_1, \dots, x_{n+1}) = \sum_{i=1}^{n+1} (-1)^{i+1} x_i S_n(x_1, \dots, \hat{x}_i, \dots, x_{n+1})$,
where $\hat{x}_i$ indicates that $x_i$ is omitted. Deduce that $S_n = 0$ implies that $S_m = 0$
for all $m \ge n$.

6. Let R be a central simple algebra of degree n over an infinite field as centre. Show
that R satisfies $S_{2n} = 0$ but no identity of degree < 2n.

7. Let R, S be such that $\mathfrak{M}_n(R) \cong \mathfrak{M}_n(S)$ and R is commutative. Show that S is also
commutative and deduce that $R \cong S$. (Hint. Apply the standard identity with
suitable arguments including $ae_{11}$, $be_{11}$.)

8. Explain the name of Lemma 7.5.9 (keeping in mind matrix notation).


7.6 Varieties of PI-algebras and Regev's theorem


Let I be any subset of the free algebra $F = k\langle X\rangle$ and denote by V(I) the collection of
all k-algebras on which all the members of I vanish identically. It is clear that V(I) is
a variety of algebras; in fact the countably generated algebras in V(I) are just the
homomorphic images of F/t, where t is the ideal of F generated by all elements
obtained by substituting elements of F for the variables in members of I. To
obtain another description of F/t we need some definitions.

Let A be a k-algebra and Y a generating set of A. The algebra A is said to be
relatively free on Y if every mapping Y → A extends to an endomorphism of A
(necessarily unique). For example, the free algebra F is relatively free on X, since
as we know, every mapping from X to any algebra A extends to a homomorphism
F → A, and we need only take the special case A = F.

An ideal t of the free algebra $F = k\langle X\rangle$ is said to be fully invariant or a T-ideal if it
admits all endomorphisms of F. By the freeness of F this means that if an element
$f(x) \in F$ belongs to t, then $f(a) \in t$ for all possible ways of replacing $x_i \in X$ by
$a_i \in F$. Given any subset I of F, the ideal generated by all elements obtained from
I by replacing $x_i \in X$ by $a_i \in F$ in all possible ways is the least T-ideal containing
I; this is also called the T-ideal generated by I. Now the ideals defining relatively
free algebras are precisely the T-ideals:


Proposition 7.6.1. A k-algebra A is relatively free on a set Y if and only if $A \cong F/t$,
where F is the free k-algebra on a set X equipotent with Y and t is a T-ideal.


Proof. Any k-algebra A with generating set Y can be written as a homomorphic
image of a free algebra F on a set X equipotent with Y:

$$A \cong F/t, \tag{7.6.1}$$

where t is the ideal of relations in A. Suppose that A is relatively free and that
$\lambda : F \to A$ is a surjective homomorphism with kernel t, whose restriction to X
defines a bijection with Y. Given any mapping $\varphi : X \to F$, there is a unique
endomorphism of F agreeing with φ on X, which we may again denote by φ. We have
to show that φ maps t into itself. By hypothesis the mapping $\lambda^{-1}\varphi\lambda : Y \to A$ extends
to an endomorphism ψ of A; thus $x\varphi\lambda = x\lambda\psi$ for all $x \in X$, hence $\varphi\lambda = \lambda\psi$ holds on
all of F. If $p \in t$, then $p\lambda = 0$, hence $p\varphi\lambda = p\lambda\psi = 0$ and it follows that
$p\varphi \in \ker \lambda = t$, which shows that t is a T-ideal, so that $A \cong F/t$ has the required form.

Conversely, assume that A = F/t, where t is a T-ideal, with natural homomorphism
$\lambda : F \to A$, and let a mapping $\psi : Y \to A$ be given. We define $\varphi : X \to F$ as follows:
given $x \in X$, choose an element u of F such that $u\lambda = x\lambda\psi$ and put $u = x\varphi$. The mapping φ
extends to an endomorphism of F, which will again be denoted by φ. Since t is a T-ideal,
if $p \in t$, then $p\varphi \in t$; thus φλ can be factored by λ to give an endomorphism h of A such
that $\varphi\lambda = \lambda h$. For any $x \in X$ we have $x\lambda h = x\varphi\lambda = x\lambda\psi$, hence h is an endomorphism
of A which agrees with ψ on Y, and this shows A to be relatively free. ∎


Our aim in what follows is to prove that the tensor product of two PI-algebras is
again a PI-algebra. This is Amitai Regev's theorem; the proof given here is due to
Victor Latyshev. Some preparation is necessary.

Consider a permutation σ of 1, 2, ..., n. We use σ to define a partial ordering on
the set {1, ..., n} by writing $i \prec j$ whenever i < j and iσ < jσ. Let us recall that an
antichain in a partially ordered set is a subset of pairwise incomparable elements,
and the width of the set is the maximum number of elements in an antichain.
For example, for our permutation σ an antichain of d elements is a set of numbers
$i_1 < i_2 < \dots < i_d$ such that $i_1\sigma > i_2\sigma > \dots > i_d\sigma$. We also recall the following
result from BA:


Dilworth's theorem. In any finite partially ordered set S, the minimum number of
disjoint chains into which S can be decomposed is the width of S. ∎


Given a permutation σ, suppose that the corresponding partially ordered set can
be decomposed into d chains. Then to specify σ we need only give the d chains and
their images under σ. For example, the permutation

$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ 3 & 5 & 8 & 1 & 4 & 7 & 6 & 2 \end{pmatrix}$$

defines a partially ordered set of width 4, and it may be expressed as the union of the
chains {1, 2, 3}, {4, 5, 6}, {7}, {8} with images {3, 5, 8}, {1, 4, 7}, {6}, {2}. Dilworth's
theorem allows us to estimate the number of permutations of a given width:


Theorem 7.6.2. The number of permutations of 1, 2, ..., n for which any set of d numbers
$(2 \le d \le n)$ contains at least one pair in their natural order is at most $(d - 1)^{2n}$.


Proof. Let σ be such a permutation and consider the partial ordering defined by σ (as
above) on {1, 2, ..., n}. The hypothesis states that no antichain has d elements, so
the set has width less than d, and by Dilworth's theorem it can be written as a disjoint
union of at most d − 1 chains. Let us number these chains from 1 to δ, where δ < d.
To specify σ we have to give the distribution of 1, 2, ..., n and their images under σ
over these δ chains. This can be done by defining two mappings from {1, ..., n} to
{1, ..., δ}. For each mapping there are $\delta^n$ choices, hence in all there are $\delta^{2n}$ choices
and so the number of permutations satisfying the given conditions is at most
$(d - 1)^{2n}$. ∎
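
Both the width of the ordering defined by a permutation and the count in Theorem 7.6.2 can be
explored by brute force. The sketch below is an illustration with hypothetical function names;
it uses the observation made above that an antichain is a decreasing run of images, so the
width is the length of a longest decreasing subsequence. A permutation is given as the tuple
of images of 1, ..., n.

```python
from itertools import permutations

def width(perm):
    """Width of the ordering defined by perm, computed as the length of a
    longest decreasing subsequence of the images."""
    longest = []    # longest[i]: longest decreasing subsequence ending at i
    for i, value in enumerate(perm):
        best = 1
        for j in range(i):
            if perm[j] > value:
                best = max(best, longest[j] + 1)
        longest.append(best)
    return max(longest, default=0)

def is_chain(positions, perm):
    """True if the given 1-based positions form a chain, i.e. their
    images under perm increase with the positions."""
    images = [perm[i - 1] for i in sorted(positions)]
    return all(a < b for a, b in zip(images, images[1:]))

def count_width_below(n, d):
    """Number of permutations of 1, ..., n whose ordering has width < d."""
    return sum(1 for p in permutations(range(1, n + 1)) if width(p) < d)

example = (3, 5, 8, 1, 4, 7, 6, 2)
print(width(example))
print(count_width_below(4, 3), (3 - 1) ** (2 * 4))   # actual count and the bound
```

The comparison in the last line suggests that the bound $(d - 1)^{2n}$ is far from sharp for
small n; what matters later is only that it grows exponentially, whereas n! grows faster.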

Let $F = k\langle X\rangle$ be the free k-algebra on $X = \{x_1, x_2, \dots\}$ and denote by $L_n$ the
subspace of all multilinear forms in $x_1, \dots, x_n$; $L_n$ is spanned by the monomials

$$x_{1\sigma} x_{2\sigma} \cdots x_{n\sigma}, \tag{7.6.2}$$

hence its dimension over k is n!. We fix an integer d, $2 \le d \le n$, and call a monomial
(7.6.2) good if the partial ordering defined by σ has width < d; by Theorem 7.6.2 the
number of good monomials (7.6.2) does not exceed $(d - 1)^{2n}$. We shall use this fact
to bound the dimension of a relatively free algebra; here it will be convenient to
restrict attention to multilinear elements.


Proposition 7.6.3. Let t be a T-ideal in the free algebra $F = k\langle X\rangle$, and put $t_n = t \cap L_n$,
where $L_n$ is the space of multilinear forms of degree n, as above. If t contains a
polynomial of degree d, where $2 \le d \le n$, then

$$[L_n/t_n : k] \le (d - 1)^{2n}. \tag{7.6.3}$$


Proof. Let us take the monomial basis in $L_n$ with the lexicographic ordering. By
linearization it follows that F/t satisfies a d-linear identity:

$$y_1 y_2 \cdots y_d = \sum_{\sigma \neq 1} a_\sigma\, y_{1\sigma} y_{2\sigma} \cdots y_{d\sigma}, \quad \text{where } a_\sigma \in k. \tag{7.6.4}$$

For the proof it will be enough to show that $L_n$ is spanned (mod t) by the good
monomials. Suppose that u is a monomial which is not good. Then we have a
factorization

$$u = u_0 x_{\alpha_1} u_1 x_{\alpha_2} u_2 \cdots x_{\alpha_d} u_d,$$

where $\alpha_1 > \alpha_2 > \dots > \alpha_d$. If in (7.6.4) we put $y_1 = x_{\alpha_1} u_1, \dots, y_d = x_{\alpha_d} u_d$, we obtain

$$u \equiv \sum \lambda_i v_i \pmod{t_n},$$

where $v_i$ comes before u in the lexicographic ordering. Repeating the process if
necessary we can after a finite number of steps reduce u to a linear combination
of good monomials. Now the conclusion follows by Theorem 7.6.2. ∎


In Proposition 7.6.3 we have obtained in (7.6.3) a bound for $[L_n/t_n : k]$ of the
form $f(d)^n$, and so we have $f(d)^{2n} < n! = [L_n : k]$ for large enough n. This estimate
now allows us to achieve our aim:


Theorem 7.6.4 (Regev's theorem, 1972). The tensor product of two PI-algebras over
a field k is again a PI-algebra.


Proof. (Latyshev) Let A, B be PI-algebras over k; clearly there is a polynomial identity
holding in both A and B. Let t be the T-ideal of all identities holding in both A and B
and suppose that t contains an identity of degree $d \ge 2$. From Proposition 7.6.3 we
know that $[L_n/t_n : k] \le (d - 1)^{2n}$ for all $n \ge d$. Let us fix $n \ge d$ and take a monomial
basis $m_1(x), \dots, m_\nu(x)$ of $L_n/t_n$, where $\nu = [L_n/t_n : k]$.

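For any permutation σ of 1, 2, ..., n we have

$$x_{1\sigma} x_{2\sigma} \cdots x_{n\sigma} \equiv \sum_i \lambda_i(\sigma)\, m_i(x) \pmod{t_n}. \tag{7.6.5}$$

In particular, this relation holds in A and B. Now consider a multilinear polynomial

$$f(x) = \sum_\sigma \gamma_\sigma\, x_{1\sigma} \cdots x_{n\sigma}, \tag{7.6.6}$$

with coefficients $\gamma_\sigma$ to be determined, in such a way that f = 0 is an identity for
$A \otimes B$. By multilinearity we need only check f on a spanning set. Given $a_1, \dots,
a_n \in A$, $b_1, \dots, b_n \in B$, we have

$$f(a_1 \otimes b_1, \dots, a_n \otimes b_n) = \sum_\sigma \gamma_\sigma\, (a_{1\sigma} \otimes b_{1\sigma}) \cdots (a_{n\sigma} \otimes b_{n\sigma})
= \sum_{i,j} \Bigl( \sum_\sigma \lambda_i(\sigma) \lambda_j(\sigma) \gamma_\sigma \Bigr)\, m_i(a) \otimes m_j(b),$$

by (7.6.6) and (7.6.5). To ensure that f = 0 on $A \otimes B$, we have to solve the equations

$$\sum_\sigma \lambda_i(\sigma) \lambda_j(\sigma) \gamma_\sigma = 0. \tag{7.6.7}$$

These are $\nu^2$ homogeneous linear equations in the n! indeterminates $\gamma_\sigma$ and they
have a non-trivial solution when $n! > \nu^2$. By Proposition 7.6.3, $\nu \le (d - 1)^{2n}$;
now d is given and for large enough n we have

$$n! > (d - 1)^{4n}.$$

For such n the equations (7.6.7) have a non-trivial solution $\gamma_\sigma^0$ and then
$f = \sum_\sigma \gamma_\sigma^0\, x_{1\sigma} \cdots x_{n\sigma}$ is the required polynomial identity for $A \otimes B$. ∎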
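
The proof yields an explicit, if very large, degree for an identity of $A \otimes B$: any
$n \ge d$ with $n! > (d - 1)^{4n}$ will do. As a rough illustration (the function name is
hypothetical, and no claim is made that this degree is optimal), the least such n can be found
by direct search with exact integer arithmetic.

```python
from math import factorial

def regev_degree(d):
    """Least n >= d with n! > (d - 1)**(4 * n), the condition used in the
    proof to guarantee a non-trivial solution of (7.6.7)."""
    n = d
    while factorial(n) <= (d - 1) ** (4 * n):
        n += 1
    return n

for d in (2, 3, 4):
    print(d, regev_degree(d))
```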


Exercises 


1. Show that an algebra A is universal for homomorphisms into some family of
algebras iff A is relatively free.

2. Show that if two algebras A, B satisfy the same polynomial identity of degree d, 
then A @ B satisfies an identity of degree at most (d — 1)". (Hint. Use Stirling’s 
formula: $(n - 1)! \sim n^n e^{-n} \sqrt{2\pi/n}$.)

3. Show that an algebra satisfying $x^2 = 0$ also satisfies xyz = 0.

4. Show that if A, B satisfy x° = 0, then A @ B satisfies x° = 0. Find an identity for 
A @ B when A satisfies x” = 0 and B satisfies x’ = 0. 




7.7 Generic matrix rings and central polynomials 


As we have seen in Section 7.6, a relatively free algebra may be characterized by the
fact that it is universal for homomorphisms to a given class of algebras. We now
define the generic matrix ring of degree n in $x_1, \dots, x_d$,

$$F_{(n)} = k\langle x_1, \dots, x_d\rangle_{(n)},$$

as the k-algebra on $x_1, \dots, x_d$ which is universal for homomorphisms into n × n
matrix algebras over commutative rings. The elements of $F_{(n)}$ may themselves be
thought of as matrices, so that $F_{(n)}$ is a ring of n × n matrices, generated by
$x_1, \dots, x_d$, and every mapping $x_\delta \mapsto a_\delta \in \mathfrak{M}_n(A)$, where A is commutative, can be
extended to a unique k-algebra homomorphism.

An explicit construction of $F_{(n)}$ is obtained as follows: we adjoin $dn^2$ commuting
indeterminates $t_{ij\delta}$ $(\delta = 1, \dots, d;\ i, j = 1, \dots, n)$ to k and in the n × n matrix ring
$\mathfrak{M}_n(k[t_{ij\delta}])$ consider the subalgebra generated by the matrices $x_\delta = (t_{ij\delta})$. Since the
$t_{ij\delta}$ are commuting indeterminates, the universal property is easily verified. When
d = 1, $k\langle x_1\rangle_{(n)}$ is just the polynomial ring in one variable, but for d > 1 (and
n > 1) $k\langle x_1, \dots, x_d\rangle_{(n)}$ is non-commutative, since n × n matrices over a non-trivial
ring do not all commute. On the other hand, $F_{(n)}$ satisfies certain identities; since
it is a subring of $\mathfrak{M}_n(A)$, where A is commutative, it satisfies, for example, the
standard identity $S_{2n} = 0$. Thus we may think of $F_{(n)}$ as the k-algebra generated by
$x_1, \dots, x_d$ subject to all the identities holding between n × n matrices. This is
expressed more precisely in


Proposition 7.7.1. Let $F = k\langle t_1, \dots, t_d\rangle$ be the free k-algebra, $F_{(n)} = k\langle x_1, \dots, x_d\rangle_{(n)}$ the
generic matrix ring of degree n and $\nu : F \to F_{(n)}$ the k-algebra homomorphism in which
$t_\delta \mapsto x_\delta$. Then $p \in F$ vanishes identically on every n × n matrix ring over a commutative
k-algebra if and only if $p\nu = 0$.


Proof. If p vanishes on every n × n matrix ring, then in particular, $p\nu = 0$.
Conversely, if $p\nu = 0$, then since every homomorphism $\varphi : F \to \mathfrak{M}_n(A)$ (A commutative)
can be factored by ν, say $\varphi = \nu\varphi'$, we have $p\varphi = p\nu\varphi' = 0$. ∎


Thus if $p \in F$ and we want to check whether p = 0 holds in all matrix rings, we
need only find its image in $F_{(n)}$. Here it is often convenient to embed the coefficient
ring $k[t_{ij\delta}]$ in an algebraically closed field K. Then we can transform any matrix over
K with distinct eigenvalues to diagonal form. Now the generic matrix $x_1 = (t_{ij1})$
certainly has distinct eigenvalues, since we can specialize it to any other matrix.
Hence we can always transform $x_1$ to diagonal form; of course the same applies to
$x_2, \dots, x_d$, but we cannot transform more than one of the x's simultaneously to
diagonal form, because they do not commute.

It turns out that $F_{(n)}$ can be embedded in a skew field; this follows from


Proposition 7.7.2. The generic matrix ring $k\langle x_1, \dots, x_d\rangle_{(n)}$ is a left and right Ore
domain.




Proof. We have already seen that $F_{(n)}$ is a PI-algebra; if we can show that it is an
integral domain, the desired result will follow by Corollary 7.5.2.

It remains to show that $F_{(n)}$ is an integral domain; the essence of the proof will be
to show that any polynomial identity holding in $F_{(n)}$ also holds in a certain division
algebra. Let K be the field of fractions of the coefficient ring $k[t_{ij\delta}]$ and let E be an
extension field of K with a K-automorphism α of order n, e.g. we may take
$E = K(\xi_1, \dots, \xi_n)$ to be a rational function field and α a cyclic permutation of the
ξ's. Let $D = E(z; \alpha)$ be the skew field of fractions of the skew polynomial ring. As
we have seen in Example 6 of Section 7.3, this is a central division algebra of
degree n over its centre C, so C is infinite. Let L be a splitting field of D; then

$$D \otimes_C L \cong L_n \cong K_n \otimes_K L. \tag{7.7.1}$$

Here the left-hand side contains D, while the right-hand side contains $K_n$. Now let
$f, g \in F_{(n)}$ and suppose that fg = 0. Then fg vanishes identically on $L_n$ and hence
(by (7.7.1)) on D. Since D is a skew field, it follows that for each choice of arguments
either f or g vanishes, so if y is a new indeterminate, then

$$f(x_1, \dots, x_d)\, y\, g(x_1, \dots, x_d) = 0, \tag{7.7.2}$$

identically in D. By Proposition 7.5.5, this also holds in $D_L \cong L_n$ and hence in $K_n$, but
$K_n$ is simple, hence prime, so either f = 0 or g = 0, as elements of $F_{(n)}$. This shows
$F_{(n)}$ to be an integral domain. ∎


From this result it follows that $F_{(n)}$ has a skew field of fractions, called the generic
division algebra of degree n over k. We shall see below (in Corollary 7.7.5) that its
degree over its centre is n.

These concepts have been used by Shimshon Amitsur [1972] to prove that not
every division algebra is a crossed product. It can be shown that if the generic
division algebra of degree n were a crossed product, with Galois group Γ, then every
division algebra of degree n would be a crossed product with group Γ. Now Amitsur
constructs two division algebras of degree n (for a certain n, and in characteristic 0)
which cannot be expressed as crossed products with the same Galois group. It follows
that the generic division algebra cannot be a crossed product (see Rowen (1988)).

In studying the centre of a PI-algebra it would be useful to have 'centre-valued' or
'central' polynomials, i.e. polynomials p which when evaluated, always yield elements
of the centre. By a central polynomial for n × n matrices one understands a
polynomial $p \in k\langle x_1, \dots, x_d\rangle$ which when evaluated in $\mathfrak{M}_n(A)$, where A is a commutative
k-algebra, takes values in the centre of $\mathfrak{M}_n(A)$. Of course we are only interested in
non-constant polynomials, i.e. polynomials taking at least two values.

As an example consider 2 × 2 matrices over R. A commutator of 2 × 2 matrices
has the form

$$\alpha = \begin{pmatrix} a & b \\ c & -a \end{pmatrix},$$

for its trace must be 0, and hence

$$\alpha^2 = \begin{pmatrix} a^2 + bc & 0 \\ 0 & a^2 + bc \end{pmatrix}.$$

This is a scalar, hence $(xy - yx)^2$ is a central polynomial for 2 × 2 matrices. This was
essentially the only non-constant central polynomial known for many years, but a
large family of central polynomials for all values of n was discovered in 1972 by
Edward Formanek. In 1974 Yuri Razmyslov discovered multilinear central
polynomials for n × n matrices, and we shall now describe his construction. We preface
the main theorem by two remarks.
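
Before turning to these remarks, the 2 × 2 example above is easily checked numerically. The
following Python sketch (hypothetical helper names, integer matrices as lists of rows)
evaluates $(xy - yx)^2$ and tests whether the result is a scalar matrix.

```python
def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_sub(a, b):
    return [[p - q for p, q in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def commutator_square(x, y):
    """Evaluate (xy - yx)^2."""
    c = mat_sub(mat_mul(x, y), mat_mul(y, x))
    return mat_mul(c, c)

def is_scalar(m):
    """True if m is a scalar multiple of the unit matrix."""
    n = len(m)
    return all(m[i][j] == (m[0][0] if i == j else 0)
               for i in range(n) for j in range(n))

x = [[1, 2], [3, 4]]
y = [[0, 1], [5, -2]]
print(commutator_square(x, y), is_scalar(commutator_square(x, y)))
```

For 3 × 3 matrices the same expression is in general no longer scalar, which is one way to see
that this particular central polynomial is special to n = 2.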

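Let k be a field; the matrix ring $k_n$ may be considered as an $n^2$-dimensional vector
space with a bilinear form tr(xy). The usual matrix basis $\{e_{ij}\}$ has the dual basis $\{e_{ji}\}$,
for we clearly have

$$\mathrm{tr}(e_{ij} e_{kl}) = \delta_{jk} \delta_{il}.$$

For simplicity we shall index the $e_{ij}$ by a single suffix λ: $e_\lambda$ $(\lambda = 1, \dots, n^2)$ and
denote the dual basis by $e_\lambda^*$. We observe that for any matrix $a = \sum a_{kl} e_{kl}$ we have

$$\sum_\lambda e_\lambda a e_\lambda^* = \sum a_{kl}\, e_{ij} e_{kl} e_{ji} = \sum_{i,j} a_{jj} e_{ii} = \mathrm{tr}(a) \cdot 1.$$

If $\{f_\lambda\}, \{f_\lambda^*\}$ is another pair of dual bases for $k_n$, say $f_\lambda = \sum_\mu c_{\lambda\mu} e_\mu$,
$e_\mu^* = \sum_\lambda c_{\lambda\mu} f_\lambda^*$, then $\sum f_\lambda a f_\lambda^* = \sum c_{\lambda\mu} e_\mu a f_\lambda^* = \sum e_\mu a e_\mu^*$,
and this proves the formula

$$\sum_\lambda e_\lambda a e_\lambda^* = \mathrm{tr}(a) \cdot 1 \quad \text{for any dual bases } \{e_\lambda\}, \{e_\lambda^*\} \text{ of } k_n. \tag{7.7.3}$$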
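
For the matrix units themselves, formula (7.7.3) reads $\sum_{i,j} e_{ij}\, a\, e_{ji} = \mathrm{tr}(a) \cdot 1$,
and this special case can be confirmed by a short computation. The sketch below is an
illustration with hypothetical names, using integer matrices stored as lists of rows.

```python
def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matrix_unit(n, i, j):
    """The n x n matrix unit e_ij (0-based indices)."""
    return [[int((r, c) == (i, j)) for c in range(n)] for r in range(n)]

def dual_basis_sum(a):
    """Sum of e_ij * a * e_ji over all i, j."""
    n = len(a)
    total = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            term = mat_mul(mat_mul(matrix_unit(n, i, j), a), matrix_unit(n, j, i))
            for r in range(n):
                for c in range(n):
                    total[r][c] += term[r][c]
    return total

a = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
print(dual_basis_sum(a))   # tr(a) = 16 times the unit matrix
```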


Secondly, put $F = k\langle x_1, \dots, x_d\rangle$ and consider the subspace Φ of elements that are
homogeneous of degree 1 in $x_1$. These elements can be written in the form
$\sum a_i x_1 b_i$, where $a_i, b_i \in k\langle x_2, \dots, x_d\rangle$, and the space Φ admits the linear mapping

$$\sum a_i x_1 b_i \mapsto \sum b_i x_1 a_i,$$

because $\sum a_i \otimes b_i = 0 \Rightarrow \sum b_i \otimes a_i = 0$. In general rings there is no reason for this
to hold, but it does hold when x is a generic matrix:


Lemma 7.7.3. Given any k-algebra R, let $x_{ij}$ be $n^2$ commuting indeterminates over k
and write $x = (x_{ij})$. Then the R-bimodule generated by x in $\mathfrak{M}_n(R \otimes k[x_{ij}])$ admits
the k-linear mapping

$$\Lambda_x : \sum a_i x b_i \mapsto \sum b_i x a_i. \tag{7.7.4}$$


Proof. The subspace of matrices linear in the $x_{ij}$ has the R-basis $x_{ij} e_{rs}$. Let us define
$(x_{ij} e_{rs})\Lambda_x = x_{sr} e_{ji}$; then (7.7.4) holds for $a_1 = e_{ri}$, $b_1 = e_{js}$, for then $a_1 x b_1 = x_{ij} e_{rs}$,
$b_1 x a_1 = x_{sr} e_{ji}$. Hence it holds generally, by linearity. ∎


This transformation $\Lambda_x$ is called the Razmyslov transposition. We observe that if u
is linear homogeneous in x, then

$$\mathrm{tr}\bigl(y \cdot (u\Lambda_x|_{x \to 1})\bigr) = \mathrm{tr}(u|_{x \to y}), \tag{7.7.5}$$

where x → y indicates that x is to be replaced by y. For u = pxq the equation reduces
to tr(yqp) = tr(pyq), which holds by the cyclic symmetry of the trace; hence it holds
generally by linearity.

We can now state the main result; instead of $\Lambda_{x_i}$ we write $\Lambda_i$.




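Theorem 7.7.4 (Razmyslov). Let

$$C = C(x_1, \dots, x_{n^2}, y_0, \dots, y_{n^2}) = \sum_\sigma \mathrm{sgn}(\sigma)\, y_0 x_{1\sigma} y_1 x_{2\sigma} \cdots y_{n^2 - 1} x_{n^2\sigma} y_{n^2},$$

where σ ranges over all permutations of $1, \dots, n^2$. Then

$$D = D(x_1, \dots, x_{n^2}, y_0, \dots, y_{n^2}, z) = \sum_i x_i z\, (C\Lambda_i|_{x_i \to 1})$$

is a non-constant central polynomial on $\mathfrak{M}_n(k)$; more precisely, on $\mathfrak{M}_n(k)$,

$$D(x, y, z) = \mathrm{tr}(z)\, \mathrm{tr}(C) \cdot 1. \tag{7.7.6}$$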


Proof. Let $\{e_\lambda\}$ be a basis for $k_n$ and denote by E the field of fractions of the
polynomial ring in the commuting indeterminates $x_i^\lambda, y_j^\lambda$ $(i = 1, \dots, n^2;\ j = 0, 1, \dots,
n^2;\ \lambda = 1, \dots, n^2)$. We shall prove the theorem by taking $x_i = \sum_\lambda x_i^\lambda e_\lambda$,
$y_j = \sum_\lambda y_j^\lambda e_\lambda$, and proving (7.7.6) in $E_n$.

In the first place we note that $\mathrm{tr}(C) \neq 0$. For if we take the $x_i$ to be the $e_{ij}$ in some
order, we can choose the $y_j$ so that

$$y_0 x_1 y_1 x_2 \cdots x_{n^2} y_{n^2} = e_{11},$$

while all the other terms in C are 0, and with these values tr(C) = 1.
It is clear from the definition that C vanishes when we put $x_i = x_j$, where $i \neq j$.
Hence by (7.7.5),

$$\mathrm{tr}\bigl(x_j (C\Lambda_i|_{x_i \to 1})\bigr) = \mathrm{tr}(C|_{x_i \to x_j}) = \delta_{ij}\, \mathrm{tr}(C).$$

It follows that up to a scalar (non-zero!) factor $\{(C\Lambda_i|_{x_i \to 1})\}$ is a dual basis for
$\{x_i\}$. So by (7.7.3),

$$D = \sum_i x_i z\, (C\Lambda_i|_{x_i \to 1}) = \mathrm{tr}(z)\, \mathrm{tr}(C) \cdot 1,$$

as we wished to show. ∎


The polynomial $C = C_n$ is called the Capelli polynomial and $D = D_n$ is the
Razmyslov polynomial for n × n matrices. By Theorem 7.7.4 it is a non-constant
central polynomial on $\mathfrak{M}_n(k)$ whose value is relatively easy to calculate, using the
formula (7.7.6).

As Claudio Procesi has observed, any central polynomial γ for n × n matrices,
with zero constant term, vanishes identically on $\mathfrak{M}_m(k)$ for m < n, for we can
regard any m × m matrix as an n × n matrix whose last n − m rows and columns
are 0. Now the value of γ must be central, i.e. a scalar, and this scalar is 0, as we
see by looking at the (n, n)-entry. More generally, let A be a simple algebra of
degree m with centre k. If E is a splitting field, then $A_E = A \otimes E \cong E_m$ and since
$D_n$ is multilinear, it vanishes on A whenever m < n. This proves




Corollary 7.7.5. The Razmyslov polynomial $D_n$ is central and non-vanishing for any
central simple algebra of degree n, while it vanishes identically on central simple algebras
of degree less than n. ∎


It is clear that $D_n$ does not vanish on the generic division algebra $F_{(n)}$, whereas
$D_{n+1}$ does; this shows $F_{(n)}$ to be of degree n. We shall return to this point in Section
8.5, when we come to discuss prime PI-algebras.


Exercises 


1. Show that the exterior algebra Λ(V) on any vector space V is a PI-algebra, but if V
is infinite-dimensional, Λ(V) does not satisfy a standard identity and so cannot
be embedded in a matrix algebra over a commutative ring.

2. Show that in a prime PI-algebra every non-zero ideal contains a central regular
element.

3. Show that Corollary 7.7.5 holds for any central polynomial with zero constant
term. (Hint. Use Proposition 7.5.5 and treat the case of a finite ground field
separately.)

4. Show that $[\mathrm{tr}(ax)b]\Lambda_x = a \cdot \mathrm{tr}(xb)$, where $\Lambda_x$ is the Razmyslov transposition.

5. (Razmyslov) Show that the commutators [a, b] = ab − ba span an $(n^2 - 1)$-dimensional
subspace of $k_n$ and use this fact to prove that (in the notation of
Theorem 7.7.4) $B = C(x, [u_1, v_1], \dots, [u_{n^2 - 1}, v_{n^2 - 1}], y_0, \dots, y_{n^2}) = \mathrm{tr}(x)p$, where p
is a matrix depending on u, v, y. Deduce that $B\Lambda_x$ is again a central polynomial.

6. Let f be a polynomial in the entries of a square matrix A over a field. Show that if
f is unchanged when A is replaced by $P^{-1}AP$, then f is a symmetric function of the
eigenvalues of A.


7.8 Generalized polynomial identities 


In all the polynomial identities considered so far the variables commute with all
the elements of the ground field; in other words, our identities were obtained by
equating elements of $k\langle X\rangle$ to zero. However, one may wish to consider situations
where the variables do not centralize the ground ring; this leads to identities obtained
by equating elements of the tensor ring $A_k\langle X\rangle$ to zero. We recall that the tensor A-ring
$A_k\langle X\rangle$, where A is any k-algebra, is defined as the ring generated by a set X over A
subject to the defining relations $\alpha x = x\alpha$, where $x \in X$, $\alpha \in k$.

Let A be any k-algebra. By a generalized polynomial identity (GPI) in A one
understands a non-zero element of the tensor A-ring $A_k\langle X\rangle$ which vanishes under all
mappings X → A. Shimshon Amitsur [1965] has shown that a primitive ring R satisfies a
GPI iff R has a non-zero socle and the endomorphism ring D of a simple R-module is
finite-dimensional over its centre. The main difference from Kaplansky's theorem (as
regards the conclusion) is that this time the degree of the identity does not provide a
bound on the dimension. This can be illustrated by the example of the n × n matrix
ring over a commutative ring, which satisfies a GPI of degree 2:

$$e_{11} x e_{11} y e_{11} - e_{11} y e_{11} x e_{11} = 0,$$

whereas an ordinary identity has degree at least 2n, as we saw in the staircase lemma
(Lemma 7.5.9). An even simpler example is given by a non-prime ring, which always
satisfies a GPI of degree 1: axb = 0 (for suitable $a, b \neq 0$). We shall confine our
attention to the special case of Amitsur's theorem where R is a skew field; thus we
shall essentially prove that a skew field satisfying a GPI is finite-dimensional over
its centre. This then will also provide an independent proof of Kaplansky's theorem
for the case of a skew field. We begin with two lemmas.
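
The GPI of degree 2 quoted above for matrix rings holds because $e_{11} x e_{11}$ is the
(1, 1)-entry of x times $e_{11}$, and these entries commute. As a quick illustration (hypothetical
helper names, integer matrices as lists of rows), the left-hand side can be evaluated directly.

```python
def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def chain(*factors):
    """Product of the given square matrices, taken from left to right."""
    result = factors[0]
    for factor in factors[1:]:
        result = mat_mul(result, factor)
    return result

def gpi_value(x, y):
    """Evaluate e11 x e11 y e11 - e11 y e11 x e11 for square matrices x, y."""
    n = len(x)
    e11 = [[int(i == 0 and j == 0) for j in range(n)] for i in range(n)]
    left = chain(e11, x, e11, y, e11)
    right = chain(e11, y, e11, x, e11)
    return [[p - q for p, q in zip(r1, r2)] for r1, r2 in zip(left, right)]

x = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
y = [[2, 0, 1], [1, 3, 0], [5, 1, 1]]
print(gpi_value(x, y))   # the zero matrix
```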


Lemma 7.8.1. Let D be a skew field, Σ a multiplicative subset of D and K its centralizer
in D. If $a_i, b_i \in D^\times$ $(i = 1, \dots, n)$ are such that

$$\sum_i a_i x b_i = 0 \quad \text{for all } x \in \Sigma, \tag{7.8.1}$$

then $a_1, \dots, a_n$ are right linearly dependent over K.


Proof. Since $a_1, b_1 \neq 0$, we have $a_1 x b_1 \neq 0$ and it follows that n > 1 in any relation
(7.8.1). Taking n minimal, we may assume that no such relation exists for a proper
subfamily of $a_1, \dots, a_n$. Now any element of D can be written in the form $\sum_j u_j b_1 v_j$
for $u_j \in \Sigma$, $v_j \in D$, j = 1, ..., r, because $\Sigma b_1 D = D$. For i = 1, ..., n we define a
mapping $\gamma_i : D \to D$ by the rule

$$\gamma_i : \sum_j u_j b_1 v_j \mapsto \sum_j u_j b_i v_j, \quad \text{where } u_j \in \Sigma,\ v_j \in D. \tag{7.8.2}$$

To show that $\gamma_i$ is well-defined, we have to verify that

$$\sum_j u_j b_1 v_j = 0 \ \Rightarrow\ \sum_j u_j b_i v_j = 0 \quad (i = 2, \dots, n).$$

Suppose then that $\sum_j u_j b_1 v_j = 0$. For any $x \in \Sigma$ and any j = 1, ..., r we have
$\sum_i a_i x u_j b_i = 0$ by (7.8.1), hence

$$0 = \sum_{i,j} a_i x u_j b_i v_j = \sum_{i=2}^{n} a_i x \Bigl( \sum_j u_j b_i v_j \Bigr).$$

This holds for all $x \in \Sigma$ and is a shorter relation than (7.8.1), hence the coefficient of
each $a_i$ must vanish, i.e. $\sum_j u_j b_i v_j = 0$, for i = 2, ..., n, as we wished to show.

This shows the mapping (7.8.2) to be well-defined. It is right D-linear, i.e.
$\gamma_i(zc) = \gamma_i(z)c$ for all $c \in D$. Taking z = 1, we find that $\gamma_i(c) = \gamma_i \cdot c$ is left
multiplication by an element $\gamma_i$. Moreover, $\gamma_i$ is also left Σ-linear: $\gamma_i w = w\gamma_i$ for all
$w \in \Sigma$, hence $\gamma_i \in K$. By definition, $\gamma_i b_1 = b_i$, hence

$$0 = \sum_i a_i x b_i = \sum_i a_i x \gamma_i b_1 = \Bigl( \sum_i a_i \gamma_i \Bigr) x b_1.$$

Since $b_1 \neq 0$, we have $\sum_i a_i \gamma_i = 0$, and here $\gamma_1 = 1$, so the a's are linearly dependent
over K. ∎




Lemma 7.8.2. Let D, Σ, K be as in Lemma 7.8.1. Suppose that $a_1, \dots, a_n \in D$ are right
linearly independent over K and $b_1, \dots, b_n \in D$ are such that the set
$E = \{\sum_i a_i x b_i \mid x \in \Sigma\}$ is contained in a finite-dimensional left K-space. Then there
exists $c \in D^\times$ such that $Kc\Sigma$ is finite-dimensional as left K-space.


Proof. Let $u_1, \dots, u_r$ be a left $K$-basis for a space containing $E$, so that

\[ \sum_i a_i x b_i = \sum_j \lambda_j(x) u_j \quad \text{for all } x \in \Sigma \text{ and some } \lambda_j(x) \in K. \]

Here we may take $b_1 = 1$ by multiplying by $b_1^{-1}$ on the right. We shall use induction
on $n$; for $n = 1$ we have $a_1 x = \sum_j \lambda_j(x) u_j$, so $u_1, \dots, u_r$ is a $K$-basis for a space
containing $K a_1 \Sigma$ and the conclusion holds with $c = a_1$.

Assume now that $n > 1$. For any $y \in \Sigma$ we have

\[ \sum_i a_i x (b_i y - y b_i) = \sum_j \lambda_j(x) u_j y - \sum_j \lambda_j(xy) u_j. \]

If $b_i y \neq y b_i$ for some $y \in \Sigma$ and some $i$, we can apply induction on $n$ to reach the
conclusion. Otherwise $b_i y = y b_i$ for all $y \in \Sigma$ and so $b_i \in K$; hence
$\sum_i a_i x b_i = (\sum_i a_i b_i) x$ and we are reduced to the case $n = 1$. ∎


We now come to the main result of this section. Here the restriction to multilinear 
elements is necessary because $\Sigma$ may not admit sums.


Theorem 7.8.3 (GPI theorem). Let $D$ be a skew field, $\Sigma$ a multiplicative subset of $D$
and $K$ its centralizer in $D$. If for all $c \in D^\times$, $Kc\Sigma$ is infinite-dimensional as left
$K$-space, then any non-zero multilinear element of $D_K\langle X \rangle$ has a non-zero value for some
choice of values of $X$ in $\Sigma$.


Proof. Let $f$ be a non-zero multilinear polynomial in $D_K\langle X \rangle$ of degree $n$, say, which
vanishes on $\Sigma$. We single out the terms in which $x_1$ occurs last and write

\[ f = \sum_{i=1}^{r} g_i x_1 b_i + \sum_{j=1}^{s} p_j x_1 q_j, \tag{7.8.3} \]

where $b_i \in D$ and no term in any $q_j$ has zero degree in the $x$'s. We may again take
$b_1 = 1$ and suppose $f$ chosen so that $r$ and $s$ are minimal. Then for any $y \in \Sigma$

\[ f(x_1, \dots, x_n) y - f(x_1 y, x_2, \dots, x_n) = \sum_i g_i x_1 (b_i y - y b_i) + \sum_j p_j x_1 (q_j y - y q_j). \tag{7.8.4} \]

Since $r$ was chosen minimal in (7.8.3), $1, b_2, \dots, b_r$ are linearly independent over $K$;
in particular, $b_i \notin K$ for $i > 1$, so none of the terms $b_i y - y b_i$ in the first sum can
vanish identically for $y \in \Sigma$. Choosing $y = y_0 \in \Sigma$ such that $b_2 y_0 \neq y_0 b_2$, we
obtain a GPI in $D$ with a smaller value of $r$, unless $r = 1$, when the first sum on
the right of (7.8.4) is absent. In the latter case consider $q_j y - y q_j$; if this vanishes
identically for all $y \in \Sigma$, then $q_j \in K$ for all values of the $x$'s in $\Sigma$. Write
$q_1 = \sum_j c_j x_2 d_j$, where the values of $x_3, \dots, x_n$ in $\Sigma$ are chosen so that $q_1 \neq 0$ (which is
possible, by induction on $n$). Then the set $\{\sum_j c_j y d_j \mid y \in \Sigma\}$ is one-dimensional
over $K$, hence $Kc\Sigma$ is finite-dimensional for some $c \in D^\times$, by Lemma 7.8.2, in
contradiction to the hypothesis.

There remains the case where the $q_j y - y q_j$ do not all vanish identically; we shall
show that this leads to a contradiction. In this case the left-hand side of (7.8.4), for
suitable $y = y_0 \in \Sigma$, is a non-zero polynomial $f_1$, again multilinear in $x_1, \dots, x_n$, with
no term in which $x_1$ is last. Moreover, each term in $f_1$ has the $x$'s in the same order as
some term in $f$, so if $x_i$ does not come last in any term of $f$, then the same is true of $f_1$.
We apply the same reduction to $x_2, \dots, x_n$ in turn and finally obtain a polynomial $f^*$
in which no $x_i$ comes last. This is impossible, so this case cannot occur. ∎


If in this theorem we choose $\Sigma = D$, then $K$ is the centre of $D$ and we obtain the
skew field case of Amitsur's theorem:


Theorem 7.8.4 (Amitsur). Let $D$ be a skew field with centre $C$ such that $[D : C] = \infty$.
If $f \in D_C\langle X \rangle$ is non-zero, then $f(a) \neq 0$ for some $a \in D^X$.


Proof. We need only observe that a non-zero polynomial $f$ leads to a non-zero
multilinear polynomial by the linearization process of Proposition 7.5.3, which
clearly still applies to generalized polynomials. ∎
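
By way of contrast with Theorem 7.8.4, a skew field that is finite-dimensional over its centre does satisfy generalized polynomial identities. The following Python sketch is an illustration that is not part of the text: it checks one such identity for the real quaternions $\mathbb{H}$, which have dimension 4 over their centre $\mathbb{R}$. Since $g(x) = x - ixi - jxj - kxk$ is four times the real part of $x$, the multilinear expression $g(x)y - yg(x)$ vanishes on $\mathbb{H}$. The names `qmul`, `g`, `gpi` and `check_gpi` are ad hoc, quaternions are represented as 4-tuples, and only a finite grid of integer quaternions is tested (which suffices here because the expression is bilinear and the grid contains a basis).

```python
from itertools import product

def qmul(p, q):
    """Hamilton product of quaternions written as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qsub(p, q):
    """Componentwise difference of two quaternions."""
    return tuple(s - t for s, t in zip(p, q))

I, J, K = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

def g(x):
    """g(x) = x - ixi - jxj - kxk, a generalized polynomial with coefficients i, j, k."""
    out = x
    for u in (I, J, K):
        out = qsub(out, qmul(qmul(u, x), u))
    return out

def gpi(x, y):
    """The multilinear generalized polynomial g(x)y - y g(x)."""
    gx = g(x)
    return qsub(qmul(gx, y), qmul(y, gx))

def check_gpi():
    """Evaluate on all quaternions with coordinates in {-1, 0, 1}."""
    samples = list(product(range(-1, 2), repeat=4))
    central = all(g(x) == (4 * x[0], 0, 0, 0) for x in samples)
    vanishes = all(gpi(x, y) == (0, 0, 0, 0) for x in samples for y in samples)
    return central and vanishes
```

Calling `check_gpi()` confirms on the sample grid that $g(x)$ is real and that `gpi` vanishes, in line with the remark that the theorem only concerns skew fields of infinite dimension over their centre.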


The above proof uses simplifications by Wallace Martindale and Yitz Herstein, see 
Herstein (1976). 


Exercises 


1. Let $R$ be a simple ring with centre $C$. Show that Lemma 7.8.1 holds for $D = R$,
$\Sigma = R$, $K = C$. Deduce that if $a, b \in R$ satisfy $axb = bxa$ for all $x \in R$, and
$a \neq 0$, then $b = \lambda a$ for some $\lambda \in C$.

2. Let $k$ be a field of characteristic 0. Show that in the Weyl algebra $A_1(k)$ with
generators $u$, $v$ every non-trivial multilinear polynomial $f$ is non-zero when the
variables in $f$ are replaced by suitable powers of $u$ and $v$.

3. Let $G$ be a group and $\Delta(G)$ the set of elements of $G$ which have only finitely many
conjugates in $G$. Show that $\Delta(G)$ is a characteristic subgroup of $G$. Show further
that if the group algebra $kG$ has a GPI holding for all arguments in $G$, then $\Delta(G)$
is of finite index in $G$.


Further exercises on Chapter 7 


1. Let f be a homomorphism from R to a skew field K. Show that f is an epi- 
morphism iff K is the subfield generated by im f. Show also that if f is injective, 
then K is left R-flat iff R is a right Ore domain. 



2. Show that the skew field of fractions of a right Ore domain $R$ is unique up to an
isomorphism leaving $R$ fixed. (Note that this does not extend to general rings,
e.g. a free algebra of rank at least two has many non-isomorphic skew fields
of fractions, see Exercise 2 of Section 7.3 and Cohn (1985), (1995).)


3. Let $M$ be a cancellation monoid. Given $a, b \in M$, if $aM \cap bM = \emptyset$, show that
the submonoid generated by $a$ and $b$ is free on these generators.


4. Let $R$ be a ring with total quotient ring $Q$. Show that $Q$ is semisimple whenever
every right $Q$-module is injective as right $R$-module. (Hint. Take a right ideal in
$Q$, form its complement $C$ as right $R$-module and verify that $C$ is a $Q$-module.)


5. Let $k$ be an algebraically closed field and let $A$ be the translation ring over $k$,
generated by $x$, $y$ with $xy = y(x + 1)$ (see Section 7.3, Example 4). Verify that
$yA$ is a prime ideal and its complement is an Ore set. Show that any prime
ideal other than 0 or $yA$ has the form $\mathfrak{p}_\alpha = yA + (x - \alpha)A$, where $\alpha \in k$.
Verify that $y\mathfrak{p}_\alpha = \mathfrak{p}_{\alpha+1} y$ and deduce that the complement of $\mathfrak{p}_\alpha$ is not an Ore
set, but the complement of $\bigcap_n \mathfrak{p}_{\alpha+n}$ is an Ore set, for each $\alpha \in k$.


6. In the $k$-algebra with generators $x$, $y$ and defining relation $xy = \lambda yx$, where
$\lambda \in k^\times$, find the maximal prime ideals and the complements of intersections
of prime ideals that are Ore sets. (Hint. Treat the case where $\lambda$ is a root of 1
separately.)


7. (Ore) Let $k$ be a field containing a finite subfield $\mathbf{F}_q$. Show that the elements of
$k[x]$ which as functions on $k$ are linear over $\mathbf{F}_q$ are the $q$-polynomials $\sum a_i x^{q^i}$,
and that they form a ring under substitution as multiplication:
$fg(x) = f(g(x))$. Verify that this ring is isomorphic to $k[z; \varphi]$, where $\varphi : a \mapsto a^q$.


8. (P. Fatou) A power series over $\mathbf{Z}$ is called primitive if no prime divides all its
coefficients. Show that the product of primitive power series is primitive.
Deduce that if $P, Q \in \mathbf{Z}[x]$ are coprime polynomials such that the power
series $P/Q$ has integer coefficients, then $Q(0) = \pm 1$. (Hint. Find polynomials
$f, g \in \mathbf{Z}[x]$ such that $fP + gQ = m$ is a positive integer and express $m$ as a
product of $Q$ and another series in $\mathbf{Z}[[x]]$.)


9. Let $R$ be a right Bezout domain. Show that any finitely generated torsion-free left
$R$-module is free. Deduce that over a 2-sided Bezout domain every finitely
generated module splits over its torsion submodule.


10. Let $R$ be a right Ore domain and $K$ its field of fractions. Show that any ring
between $R$ and $K$ is again right Ore.


11. Show that a semiprime right Goldie ring satisfies the minimum condition on
right annihilator ideals. (Hint. In any chain the rank becomes stationary; now
use (a) ⇒ (c) in the proof of Theorem 7.4.9 and L.2 to show that essential
extensions are trivial.)


12. A module is called semi-Artinian if every non-zero quotient contains a simple
submodule. Show that a semi-Artinian module is non-singular iff its socle is
projective.


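13. (A. R. Kemer) If the symmetrizer $h_\alpha$ corresponds to the polynomial $F_\alpha(x_1, \dots, x_n)$
by the rule $\sum a_\sigma \sigma \mapsto \sum a_\sigma x_{1\sigma} x_{2\sigma} \cdots x_{n\sigma}$, show that $F_\alpha$ is the linearization of
$S_{\nu_1}(x_1, \dots, x_{\nu_1}) \cdots S_{\nu_r}(x_1, \dots, x_{\nu_r})$, where the $S_i$ are standard polynomials and
$\alpha$ has columns $\nu_1, \dots, \nu_r$.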


14. (P. J. Higgins) Write $P(x, y) = \sum_{i=0}^{n-1} x^i y x^{n-1-i}$. Show that if an algebra $A$ over a
field of characteristic prime to $n!$ satisfies $x^n = 0$, then it also satisfies
$\sum_\sigma x_{1\sigma} \cdots x_{n\sigma} = 0$, where $\sigma$ runs over all permutations of $1, \dots, n$, and deduce
that $P(x, y) = 0$ in $A$. By evaluating the expression $\sum_{i,j} x^i z y^j x^{n-1-i} y^{n-1-j}$ in
two ways, show that $A$ satisfies the identity $x^{n-1} z y^{n-1} = 0$.


15. (P. J. Higgins) Use Exercise 14 to prove the Nagata-Higman theorem: if an
algebra $A$ of characteristic prime to $n!$ satisfies the identity $x^n = 0$, then
$A^{2^n - 1} = 0$. (Hint. Let $I$ be the ideal generated by all elements $a^{n-1}$, where
$a \in A$. Show that $IAI = 0$ by Exercise 14 and apply induction on $n$ to $A/I$.)


Rings without finiteness assumption


For general rings there is naturally not as much structure theory as in the Artinian 
or Noetherian case. It is true that some of the same methods can be used, e.g. the 
radical can be defined, semiprimitive rings can be expressed as subdirect products 
of primitive rings etc., but these methods are less precise and they do not lead to 
a complete classification. For primitive rings a structure theorem can be proved 
using a general version of the density theorem; this is presented in Section 8.1 and 
applied in Section 8.2, while Section 8.3 deals with semiprimitive rings. So far we 
have taken the existence of a unit element for granted, but some work has been 
done on ‘rings without one’ and we cast a brief glance at it in Section 8.4; we 
shall examine the case of simple rings and also see when the existence of a ‘one’ 
follows from other assumptions. In Section 8.5 we study semiprime rings, and in 
Section 8.6 we present an analogue of Goldie’s theorem for PI-rings. The final
section, Section 8.7, takes a brief look at a natural generalization of principal ideal 
domains: free ideal rings. 


8.1 The density theorem revisited 


One of the basic results of ring theory is the Wedderburn-Artin theorem: a simple
Artinian ring is a matrix ring over a skew field (BA, Theorem 5.2.2). A related
result is the density theorem (Theorem 5.1.1), which for a simple ring $A$
finite-dimensional over its centre $k$ tells us that $A^{\circ} \otimes A$ is a full matrix ring over $k$. In
1945 Nathan Jacobson (and independently, Claude Chevalley) proved a far-reaching 
generalization, as part of his theory without finiteness assumptions. Our object is to 
present this result, but we begin by examining a particular case, the endomorphism 
ring of a vector space. 

Let $K$ be a skew field and $V$ a left $K$-module; as is well known (see BA, Theorem
11.1.5), $V$ is free as $K$-module and any two bases of $V$ have the same cardinal, called
the rank or also the dimension of $V$ over $K$ and written $[V : K]$. Let $E = \operatorname{End}_K(V)$
be its endomorphism ring; when $[V : K]$ is finite, equal to $n$ say, then
$E \cong \operatorname{End}_K(K^n) \cong \mathfrak{M}_n(K)$, the $n \times n$ matrix ring over $K$, and this is a simple ring
(BA, Theorem 5.2.2). We now ask: what can we say about $E$ when $V$ is
infinite-dimensional? In this case $E$ is no longer simple; its ideal structure is described in
Theorem 8.1.3 below.




The first step is to find a matrix representation for $E$. Here it is not necessary for $K$
to be a skew field; we may take any ring $R$ and consider a free $R$-module of infinite
rank $\nu$. Thus we take an index-set $I$ of cardinal $\nu$ and take a free left $R$-module $V$ with
basis $\{v_\alpha\}$ $(\alpha \in I)$. In terms of this basis any endomorphism $a$ of $V$ is described by the
equations expressing the image of each $v_\alpha$ in terms of the $v$'s:

\[ v_\alpha a = \sum_\beta a_{\alpha\beta} v_\beta, \quad \text{where } a_{\alpha\beta} \in R. \tag{8.1.1} \]

Here $(a_{\alpha\beta})$ is a $\nu \times \nu$ matrix, i.e. a square array of elements of $R$, whose rows and
columns are indexed by a set $I$ of cardinal $\nu$. Moreover, for each $\alpha \in I$, there are
only finitely many non-zero coefficients $a_{\alpha\beta}$ in (8.1.1), hence each row of the
matrix $(a_{\alpha\beta})$ contains only finitely many non-zero entries; we say: it is row-finite.
Conversely, every row-finite $\nu \times \nu$ matrix over $R$ defines an endomorphism of $V$
relative to the basis $\{v_\alpha\}$. For we can define $v_\alpha a$ by (8.1.1), and for a general element
$x = \sum_\alpha \xi_\alpha v_\alpha$ of $V$ put

\[ xa = \sum_{\alpha, \beta} \xi_\alpha a_{\alpha\beta} v_\beta. \]

It is easily checked that the mapping $a$ so defined is an endomorphism of $V$, so that
we have a bijection between $\operatorname{End}_R(V)$ and the set $\mathfrak{M}_\nu(R)$ of all row-finite $\nu \times \nu$
matrices over $R$. If we define the addition and multiplication of row-finite matrices,
as in the finite case, by the formulae

\[ (a_{\alpha\beta}) + (b_{\alpha\beta}) = (a_{\alpha\beta} + b_{\alpha\beta}), \qquad (a_{\alpha\beta})(b_{\alpha\beta}) = \Bigl( \sum_\gamma a_{\alpha\gamma} b_{\gamma\beta} \Bigr), \]

we find that $\mathfrak{M}_\nu(R)$ is a ring isomorphic to $E$. We observe that some restriction on
the matrices, such as row-finiteness, is essential for the product to be defined. Our
conclusion may be stated as


Theorem 8.1.1. Let $R$ be any ring and $V$ a free left $R$-module of infinite rank $\nu$. Then
$\operatorname{End}_R(V)$ is isomorphic to the ring of all row-finite $\nu \times \nu$ matrices over $R$. ∎
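
The role of row-finiteness can be made concrete in a small Python sketch (an illustration under stated assumptions, not taken from the text). An infinite row-finite matrix over $\mathbf{Z}$ with rows and columns indexed by $0, 1, 2, \dots$ is modelled here as a function sending a row index $\alpha$ to a finite dictionary $\{\beta : a_{\alpha\beta}\}$ of its non-zero entries. The names `compose`, `shift_up` and `shift_down` are hypothetical. Because each row is finite, the sum $\sum_\gamma a_{\alpha\gamma} b_{\gamma\beta}$ in the product has only finitely many terms.

```python
def compose(a, b):
    """Row function of the product ab (apply a first, then b, acting on the right)."""
    def row(alpha):
        out = {}
        # Row alpha of ab is the finite sum over gamma of a[alpha][gamma] * (row gamma of b).
        for gamma, coeff in a(alpha).items():
            for beta, c2 in b(gamma).items():
                out[beta] = out.get(beta, 0) + coeff * c2
        # Drop zero entries so that equal endomorphisms give equal dictionaries.
        return {beta: c for beta, c in out.items() if c != 0}
    return row

def shift_up(n):
    """The endomorphism v_n -> v_{n+1}."""
    return {n + 1: 1}

def shift_down(n):
    """The endomorphism v_n -> v_{n-1} for n > 0, and v_0 -> 0."""
    return {n - 1: 1} if n > 0 else {}

up_then_down = compose(shift_up, shift_down)   # acts as the identity on every v_n
down_then_up = compose(shift_down, shift_up)   # kills v_0, fixes v_n for n > 0
```

Here `up_then_down(n)` is `{n: 1}` for every `n`, whereas `down_then_up(0)` is `{}`; so the product of two row-finite matrices is again row-finite, and one-sided inverses need not be two-sided in the infinite case.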


Let us return to the case of a skew field $K$. When $V$ is a finite-dimensional $K$-space,
say $[V : K] = n$, then $\operatorname{End}_K(V) \cong \mathfrak{M}_n(K)$ is a simple ring, and it can be written as
a direct sum of $n$ pairwise isomorphic simple left ideals (BA, Theorem 5.2.2),
corresponding to the columns of the matrix. In the infinite case the situation is rather
different. We still have the minimal left ideals, corresponding to the columns of the
matrix, but we can no longer express the general matrix as the sum of a finite
number of such columns, and $\operatorname{End}_K(V)$ is no longer simple.

For any $a \in \operatorname{End}_K(V)$ let us define the rank of $a$, $\rho(a)$, as the $K$-dimension of the
image space:

\[ \rho(a) = [\operatorname{im} a : K]. \]


This agrees with the usual definition in the finite-dimensional case, and we have the 
following rules: 


R.1 $\rho(a)$ is a cardinal satisfying $0 \leq \rho(a) \leq [V : K]$,

R.2 $\rho(a) = 0 \Leftrightarrow a = 0$,

R.3 $\rho(a - b) \leq \rho(a) + \rho(b)$,

R.4 $\rho(ab) \leq \min\{\rho(a), \rho(b)\}$.


Of these, R.1 and R.2 are clear, and R.3 follows because $V(a - b) \subseteq Va + Vb$, and so
$[V(a - b) : K] \leq [Va : K] + [Vb : K]$. Turning to R.4, we clearly have $Vab \subseteq Vb$,
hence $\rho(ab) \leq \rho(b)$, but also $[Vb : K] \leq [V : K]$; hence on replacing $V$ by $Va$ we
find that $[Vab : K] \leq [Va : K]$ and so $\rho(ab) \leq \rho(a)$.
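
In the finite-dimensional case the rules R.3 and R.4 can be checked numerically. The sketch below is an illustration that is not part of the text: it works over $\mathbf{Q}$ with $4 \times 4$ integer matrices acting on row vectors from the right, so that the image $Va$ is the row space of $a$ and the product $ab$ means "first $a$, then $b$". The helper names (`rank`, `matmul`, `matsub`, `random_matrix`, `check_rank_rules`) are hypothetical, and a random test is of course only evidence, not a proof.

```python
import random
from fractions import Fraction

def rank(m):
    """Rank of a matrix over Q, by Gaussian elimination with exact fractions."""
    rows = [[Fraction(x) for x in r] for r in m]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
    return rk

def matmul(a, b):
    """Ordinary matrix product a * b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def matsub(a, b):
    """Entrywise difference a - b."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def random_matrix(rng, n, r):
    """An n x n integer matrix of rank at most r (a product of an n x r and an r x n matrix)."""
    left = [[rng.randint(-2, 2) for _ in range(r)] for _ in range(n)]
    right = [[rng.randint(-2, 2) for _ in range(n)] for _ in range(r)]
    return matmul(left, right)

def check_rank_rules(trials=200, n=4, seed=0):
    """Test R.3 and R.4 on randomly generated matrices of assorted ranks."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = random_matrix(rng, n, rng.randint(1, n))
        b = random_matrix(rng, n, rng.randint(1, n))
        assert rank(matsub(a, b)) <= rank(a) + rank(b)        # R.3
        assert rank(matmul(a, b)) <= min(rank(a), rank(b))    # R.4
    return True
```

Generating the test matrices as products of thin factors ensures that low ranks actually occur, so that the inequalities are exercised in non-trivial cases.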

To elucidate the ideal structure of $\operatorname{End}_K(V)$, we note the following relation
between ranks of endomorphisms:


Lemma 8.1.2. Let $K$ be a skew field and $V$ a left $K$-space. If $a, b \in \operatorname{End}_K(V)$ and
$\rho(a) \geq \rho(b)$, then there exist $p, q \in \operatorname{End}_K(V)$ such that

\[ b = paq, \tag{8.1.2} \]

and $\rho(p) = \rho(q) = \rho(b)$.


Proof. Choose complements $N_1$, $N_2$ of $\ker a$, $\ker b$ in $V$ respectively, so that
$V = \ker a \oplus N_1 = \ker b \oplus N_2$. Then $N_1 \cong \operatorname{im} a$, $N_2 \cong \operatorname{im} b$, so by hypothesis,
$[N_1 : K] \geq [N_2 : K]$; hence there exists $p \in \operatorname{End}_K(V)$ mapping $\ker b$ to 0 and
embedding $N_2$ in $N_1$; clearly $\rho(p) = \rho(b)$. If $\{u_\lambda\}$ is a basis of $N_2$, then the $u_\lambda p$ are
linearly independent in $N_1$ and the $u_\lambda pa$ are linearly independent, because the
restriction $a|N_1$ is injective. Likewise the $u_\lambda b$ are linearly independent in $Vb$. Now
choose a complement $L$ in $V$ for the subspace spanned by the $u_\lambda pa$, and define $q$
as the endomorphism mapping $L$ to 0 and $u_\lambda pa$ to $u_\lambda b$. Then $\rho(q) = \rho(b)$ and
(8.1.2) holds, for both sides map $u_\lambda$ to $u_\lambda b$ and $\ker b$ to 0. ∎
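
As a small check of this construction (an illustrative example, not taken from the text), let $V$ have basis $e_1, e_2$ and let $a$, $b$ be given by $e_1 a = e_2$, $e_2 a = 0$ and $e_1 b = 0$, $e_2 b = e_1$, so that $\rho(a) = \rho(b) = 1$. Following the proof with $N_1 = Ke_1$, $N_2 = Ke_2$, $u = e_2$ and $L = Ke_1$ gives:

```latex
\begin{align*}
p &: e_1 \mapsto 0,\quad e_2 \mapsto e_1
  && \text{($p$ kills $\ker b = Ke_1$ and embeds $N_2$ in $N_1$)},\\
q &: e_1 \mapsto 0,\quad e_2 \mapsto e_1
  && \text{($q$ kills $L$ and maps $e_2 pa = e_2$ to $e_2 b = e_1$)},\\
e_2(paq) &= ((e_2 p)a)q = (e_1 a)q = e_2 q = e_1 = e_2 b,
  \qquad e_1(paq) = 0 = e_1 b.
\end{align*}
```

Thus $b = paq$ with $\rho(p) = \rho(q) = \rho(b) = 1$, as the lemma asserts.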


With this preparation we can describe the ideal structure of $\operatorname{End}_K(V)$:


Theorem 8.1.3. Let $K$ be a skew field and $V$ a $K$-space of infinite dimension $\nu$. For any
infinite cardinal $\mu$ denote by $E_\mu$ the set of all endomorphisms of $V$ of rank $< \mu$. Then
the $E_\mu$ (for $\mu \leq \nu$) are distinct, each $E_\mu$ is an ideal in $E = \operatorname{End}_K(V)$, and these are the
only ideals apart from 0 and $E$.


Proof. Let $a, b \in E_\mu$, where $\mu$ is an infinite cardinal. Then

\[ \rho(a - b) \leq \rho(a) + \rho(b) < \mu \]

(by R.3 and BA, Proposition 1.2.7), hence $a - b \in E_\mu$. Next, if $a \in E_\mu$ and $c \in E$, then
$\rho(ac) < \mu$, $\rho(ca) < \mu$, by R.4, and this shows $E_\mu$ to be an ideal in $E$. There are
endomorphisms of any rank $\leq \nu$, e.g. projections on subspaces, hence all the $E_\mu$ are
distinct. It remains to show that there are no other ideals.

Let $\mathfrak{a}$ be a non-zero ideal in $E$ and denote by $\mu$ the least cardinal $> \rho(a)$ for all
$a \in \mathfrak{a}$. Then $\mu$ is infinite; for $\mathfrak{a}$ contains endomorphisms of positive rank, hence
(by Lemma 8.1.2) $\mathfrak{a}$ contains all endomorphisms of rank 1. But every endomorphism
of finite rank can be written as a sum of endomorphisms of rank 1, so $\mathfrak{a}$ contains all
endomorphisms of finite rank and $\mu$ must be infinite. By definition, each $a \in \mathfrak{a}$ has
rank $< \mu$, hence $\mathfrak{a} \subseteq E_\mu$. Conversely, if $b \in E_\mu$, then $\rho(b) < \mu$, hence $\rho(b) \leq \rho(a)$ for
some $a \in \mathfrak{a}$. By Lemma 8.1.2, $b = paq$ for some $p, q \in E$, so $b \in \mathfrak{a}$, and this shows
that $\mathfrak{a} = E_\mu$. Thus every non-zero ideal in $E$ is of the form $E_\mu$ for some infinite
cardinal $\mu$, and clearly if $\mu > \nu$, then $E_\mu = E$. ∎


This result shows in particular that (for infinite $[V : K]$) $\operatorname{End}_K(V)$ is never simple.
However, it has another property; as we shall see in Section 8.2, $\operatorname{End}_K(V)$ is
primitive, and there is a close relation between primitive rings and such endomorphism
rings, which is described in Theorem 8.2.3.

We now come to the density theorem in its general form; our presentation follows 
Nicolas Bourbaki. We begin by describing the bicentral action on a module. 

Let $R$ be any ring and $M$ a right $R$-module; the action of $R$ on $M$ may be described
by saying that we have a homomorphism of $R$ into $\operatorname{End}(M)$, each $a \in R$
corresponding to an endomorphism $a'$ of $M$, qua additive group. The centralizer $S$ of this set
$R' = \{a' \mid a \in R\}$ in $\operatorname{End}(M)$ is the ring of all $R$-endomorphisms, $S = \operatorname{End}_R(M)$,
and we shall regard $M$ as left $S$-module. Since $(\alpha x)a = \alpha(xa)$ for all $x \in M$, $a \in R$,
$\alpha \in S$, by definition of $S$, we see that $M$ becomes an $(S, R)$-bimodule in this way.
Let $T$ be the centralizer of $S$; this is again a subring of $\operatorname{End}(M)$, called the bicentralizer
of $R$. Clearly for any $a \in R$, $a'$ centralizes $S$ and so lies in $T$, i.e. $R' \subseteq T$. If equality
holds, we say that $R$ acts bicentrally on $M$, or also that $M_R$ is bicentral.

As an example consider the ring $R$ itself. As is well known and easily proved (see
BA, Theorem 5.1.3), the centralizer of ${}_R R$ is the set of all right multiplications; by
symmetry the centralizer of $R_R$ is the set of all left multiplications, hence ${}_R R$ as
well as $R_R$ is bicentral. For this property the presence of a unit element is of course
material.

The definition of ‘bicentral’ may be restated as follows: Given a right $R$-module $M$,
$R$ acts bicentrally if for each $\theta$ in the bicentralizer $T$ there exists $a \in R$ such that

\[ x\theta = xa \quad \text{for all } x \in M. \tag{8.1.3} \]

This is a very strong requirement, because $a$ in (8.1.3) is independent of $x$. To obtain
a weaker condition, let us say that $R$ acts fully on $M$ if for each $\theta$ in the bicentralizer $T$
and each $x \in M$ there exists $a \in R$ such that $x\theta = xa$, where $a$ may depend on the
choice of $x \in M$. But the most useful condition is intermediate between these two.
We shall say that $R$ acts densely on $M$ if for each $\theta \in T$ and each finite family
$x_1, \dots, x_n \in M$ there exists $a \in R$ such that

\[ x_i\theta = x_i a \quad (i = 1, \dots, n). \tag{8.1.4} \]

This reduces to the definition in Section 3.1 when the centralizer of $R$ is a field $k$.
It is clear that bicentral ⇒ dense ⇒ full. The next result gives a useful condition
under which dense ⇔ bicentral.
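
To see that even the weakest of these conditions can fail, consider (as an illustration not taken from the text) $R = \mathbf{Z}$ acting on $M = \mathbf{Q}$. Every additive endomorphism of $\mathbf{Q}$ is multiplication by a rational number, which gives:

```latex
\begin{align*}
S &= \operatorname{End}_{\mathbf{Z}}(\mathbf{Q}) = \operatorname{End}(\mathbf{Q}) \cong \mathbf{Q},\\
T &= \text{centralizer of } S \text{ in } \operatorname{End}(\mathbf{Q}) \cong \mathbf{Q},\\
\theta &= \tfrac{1}{2} \in T,\ x = 1: \quad
  x\theta = \tfrac{1}{2} \neq xa \quad \text{for every } a \in \mathbf{Z}.
\end{align*}
```

So $\mathbf{Z}$ does not act fully on $\mathbf{Q}$, and hence acts neither densely nor bicentrally.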


Proposition 8.1.4. Let $M$ be an $R$-module with centralizer $S$. If $M$ is finitely generated
as $S$-module, then $R$ acts densely on $M$ if and only if it acts bicentrally.


Proof. Let $M = \sum_{i=1}^{n} Su_i$; given $\theta$ in the bicentralizer, choose $a \in R$ such that
$u_i\theta = u_i a$ $(i = 1, \dots, n)$ and consider the set $N = \{x \in M \mid x\theta = xa\}$. This is clearly
an $S$-submodule of $M$ and it contains the generating set $u_1, \dots, u_n$ of $M$, hence
$N = M$, i.e. $x\theta = xa$ for all $x \in M$, so $R$ acts bicentrally. The converse is clear. ∎


To motivate the terminology we remark that $\operatorname{End}(M)$ may be regarded as a set of
mappings from $M$ to $M$, i.e. a subset of $M^M$, and this set can be topologized as a
product, taking the discrete topology on $M$. The topology so defined on $\operatorname{End}(M)$
is called the topology of pointwise convergence. Given $f \in M^M$, we obtain a typical
neighbourhood of $f$ by taking a finite set $x_1, \dots, x_n \in M$ and considering all
$f' \in M^M$ such that $x_i f' = x_i f$ $(i = 1, \dots, n)$. Now (8.1.4) may be restated by saying
that $R$ is dense in its bicentralizer $T$ (i.e. every non-empty open subset of $T$ contains
an element of $R$). In this connexion we note that any centralizer and hence any
bicentralizer is closed in the topology defined here on $\operatorname{End}(M)$.

Let $M$ be a right $R$-module, denote the centralizer (acting on the left) by $S$ and the
bicentralizer by $T$. As is easily verified, the centralizer of ${}^n M_R$ is $S_n$ (BA, Corollary
4.4.2), so that we may regard ${}^n M$ as left $S_n$-module, and the centralizer of this
module is $T$ (BA, Theorem 4.4.6). Thus we have


Proposition 8.1.5. For any $R$-module $M$ and any $n \geq 1$, $M$ and ${}^n M$ have isomorphic
bicentralizers. ∎


To say that $R$ acts densely on $M$ means that for all $n$ and all $x \in {}^n M$, $\theta \in T$, there
exists $a \in R$ such that $x\theta = xa$. This just states that $R$ acts fully on ${}^n M$, for all $n$; thus
we have proved


Theorem 8.1.6 (Density theorem). Let $M$ be any right $R$-module. Then $R$ acts densely
on $M$ if and only if $R$ acts fully on ${}^n M$, for all $n \geq 1$.


We give several applications, which show the power of this result. In the first place, 
we clearly have 


Corollary 8.1.7. If $R$ acts densely on a module $M$, then it acts densely on ${}^n M$, for any
$n \geq 1$. ∎

An important case of dense action is provided by semisimple modules.


Theorem 8.1.8. Let $M$ be a semisimple right $R$-module. Then $R$ acts densely on $M$, and
$M$ is also semisimple over the centralizer of $M$. Moreover, if $R$ is right Artinian, then $M$
is finitely generated over the centralizer of $R$ and $R$ acts bicentrally on $M$.


Proof. Denote by $S$ the centralizer and by $T$ the bicentralizer of $R$ on $M$. We first
show that every $R$-submodule of $M$ is a $T$-submodule. Let $M_1$ be an $R$-submodule
of $M$; since $M$ is semisimple, we have $M = M_1 \oplus M_2$ for a submodule $M_2$. Let $\theta_1$
be the projection on $M_1$; then $\theta_1 \in S$, hence for any $t \in T$ and $x \in M_1$,
$xt = (\theta_1 x)t = \theta_1(xt)$, so $xt \in M_1$ as claimed.

Next we show that $R$ acts fully on $M$, i.e. for each $t \in T$ and $x \in M$ there exists
$a \in R$ such that $xt = xa$. This states that for each $x \in M$, $xT \subseteq xR$. But $xR$ is an
$R$-submodule containing $x$, hence it admits $T$, by what has been shown, and so
$xT \subseteq xR$, as required.

We apply this result to ${}^n M$. This is again semisimple, hence $R$ acts fully on ${}^n M$ for
all $n$, and so, by the density theorem, $R$ acts densely on $M$.

To prove that $M$ is semisimple as $S$-module, we can write $M = \sum Sx$, where $x$ runs
over all elements of all simple $R$-submodules, for every element of $M$ is a sum of such
elements. The conclusion will follow if we show that $Sx$ is simple or 0. Take
$0 \neq y \in Sx$, say $y = sx$. Then for any $a \in R$, the mapping $xa \mapsto sxa = ya$ is a
homomorphism $xR \to yR$, which is surjective and $yR \neq 0$. Hence it is an isomorphism; if
$xR = yR$, we can write $M = xR \oplus N$ and find an $R$-automorphism of $M$ mapping $y$
to $x$. Otherwise $xR \cap yR = 0$ and we can write $M = xR \oplus yR \oplus N$; now it is clear how
the isomorphism from $yR$ to $xR$ can be extended to an $R$-endomorphism of $M$. It
follows that $Sy$ contains $x$, hence $Sy = Sx$ and this shows $Sx$ to be simple. Thus $M$
has been expressed as a sum of simple $S$-modules and by omitting redundant
terms we see that ${}_S M$ is semisimple.

Finally, assume that $R$ is right Artinian. Let us write $M = \bigoplus Su_\lambda$; we claim that
this sum is finite, for if not there is a countable subset: $u_1, u_2, \dots$. If we put
$\mathfrak{a}_n = \{a \in R \mid u_1 a = \dots = u_n a = 0\}$, then $\mathfrak{a}_n$ is a right ideal of $R$ and

\[ \mathfrak{a}_1 \supseteq \mathfrak{a}_2 \supseteq \cdots. \tag{8.1.5} \]

The projection of $M$ on the complement of $Su_1 \oplus \dots \oplus Su_n$ is an $S$-endomorphism,
i.e. there exists $t \in T$ such that $u_i t = 0$ $(i = 1, \dots, n)$, $u_j t = u_j$ $(j > n)$. By density
there exists $a \in R$ such that $u_i a = 0$ $(i = 1, \dots, n)$, $u_{n+1} a = u_{n+1}$. Hence
$a \in \mathfrak{a}_n \setminus \mathfrak{a}_{n+1}$, and this shows the inclusions in (8.1.5) to be strict. But this contradicts the
fact that $R$ is right Artinian. Therefore $M = Su_1 \oplus \dots \oplus Su_n$ for some $n$, and by
Proposition 8.1.4, $R$ acts bicentrally. ∎


For a simple module we can say rather more. By combining Corollary 8.1.7 with 
Schur’s lemma (Lemma 6.3.1) we obtain a generalization of Wedderburn’s first 
structure theorem (see BA, Theorem 5.2.2): 


Corollary 8.1.9. Let $M$ be a simple right $R$-module. Then the centralizer $K$ of $R$ is a
skew field and $R$ acts as a dense ring of linear transformations on $M$. In particular, if
$R$ is right Artinian, then the image of $R$ in $\operatorname{End}(M)$ acts bicentrally, and so is a full
matrix ring over $K$.


Proof. By Schur’s lemma the centralizer $K$ is a skew field. When $R$ is right Artinian,
then $M$ is finitely generated over $K$, by Theorem 8.1.8, and so is finite-dimensional,
say $M \cong K^n$, and $R$ acts bicentrally. Hence $R$ as centralizer of $K$ is $K_n$. ∎


Exercises 


1. Let $E = \operatorname{End}_K(V)$ and $E_\nu$ be as in the text, where $[V : K] = \nu$ is infinite. Verify
that $E/E_\nu$ is simple but not semisimple. Show that for $a, b \in E$ there exists
$p$ such that $b = pa$ iff $\operatorname{im} b \subseteq \operatorname{im} a$ and there exists $q$ such that $b = aq$ iff
$\ker b \supseteq \ker a$. Deduce that $E$ is neither Artinian nor Noetherian; prove the
same for $E/E_\nu$. (Hint. Consider the set of endomorphisms with image in a given
finite-dimensional subspace, or with kernel containing a given finite-dimensional
subspace.)

2. In $E = \operatorname{End}_K(V)$ show that the endomorphisms with image in a given
one-dimensional subspace form a minimal left ideal and that all such left ideals are
isomorphic. Show further that the sum of all these left ideals is the unique
minimal ideal in $E$ (it is the socle of $E$).

3. In the notation of Theorem 8.1.3 show that $E_\mu$, as algebra without 1, has as ideals
precisely the $E_\lambda$ $(\lambda \leq \mu)$ and 0.

4. Show that for any vector space $V$ over a field $k$, $\operatorname{End}_k(V)$ is a regular ring.

5. Show that in any monoid the centralizer of any subset is its own bicentralizer.
Deduce that for any $R$-module $M$, $\operatorname{End}_R(M)$ acts bicentrally on $M$.

6. Let $A$ be a finitely generated abelian group, as $\mathbf{Z}$-module. Show that $\mathbf{Z}$ acts
bicentrally on $A$.

7. Show that a simple ring acts bicentrally on any right ideal.


8.2 Primitive rings 


Let $R$ be any ring and $M$ a right $R$-module. We say that $R$ acts faithfully on $M$ or that
$M$ is a faithful $R$-module if for any $a \in R$, $a \neq 0$, we have $Ma \neq 0$. This means that
the standard homomorphism $R \to \operatorname{End}(M)$ is injective. A ring $R$ is called primitive
if it has a simple faithful right R-module. Strictly speaking this type of ring should be 
called ‘right primitive’ and a corresponding notion ‘left primitive’ should be defined. 
In fact these concepts are distinct, as examples show, but we shall only be dealing 
with right primitive rings and so omit the qualifying adjective. 

To obtain an internal characterization of primitive rings, we define for any right
ideal $\mathfrak{a}$ of $R$ its core as the set

\[ (\mathfrak{a} : R) = \{x \in R \mid Rx \subseteq \mathfrak{a}\}. \tag{8.2.1} \]

If we regard $M = R/\mathfrak{a}$ as a right $R$-module, $(\mathfrak{a} : R)$ is the annihilator of $M$ in $R$. This
shows the core to be an ideal in $R$; moreover $(\mathfrak{a} : R) \subseteq \mathfrak{a}$ and any (two-sided) ideal of
$R$ contained in $\mathfrak{a}$ is contained in $(\mathfrak{a} : R)$, by the definition; thus the core of $\mathfrak{a}$ is the
largest ideal of $R$ contained in $\mathfrak{a}$. With its help primitive rings can be characterized
as follows:


Proposition 8.2.1. A ring $R$ is (right) primitive if and only if $R$ contains a maximal
right ideal whose core is zero.


Proof. If $R$ is primitive, there is a faithful simple right $R$-module $M$. We have
$M \cong R/\mathfrak{a}$, where $\mathfrak{a}$ is a maximal right ideal (by the simplicity of $M$), and since $M$
is faithful, $(\mathfrak{a} : R) = 0$. Conversely, if $\mathfrak{a}$ is a maximal right ideal with zero core,
then $R/\mathfrak{a}$ is a faithful simple right $R$-module. ∎
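
For a small finite ring the criterion of Proposition 8.2.1 can be tested by brute force. The following Python sketch is an illustration under stated assumptions, not part of the text: it takes rings of $2 \times 2$ matrices over $\mathbf{F}_2$, stored as 4-tuples $(a, b, c, d)$ for the matrix with rows $(a, b)$ and $(c, d)$, lists all right ideals, picks out the maximal ones and computes their cores from (8.2.1). The function names are hypothetical. Up to four generators are tried, which suffices because every right ideal here is an $\mathbf{F}_2$-subspace of dimension at most 4.

```python
from itertools import combinations, product

ZERO = (0, 0, 0, 0)

def add(x, y):
    """Sum of two matrices over F_2."""
    return tuple((s + t) % 2 for s, t in zip(x, y))

def mul(x, y):
    """Product of two 2 x 2 matrices over F_2."""
    a, b, c, d = x
    e, f, g, h = y
    return ((a*e + b*g) % 2, (a*f + b*h) % 2, (c*e + d*g) % 2, (c*f + d*h) % 2)

def right_ideal(gens, ring):
    """Right ideal generated by gens: the additive span of all products g*r."""
    ideal = {ZERO}
    for g in gens:
        for r in ring:
            p = mul(g, r)
            # In characteristic 2 the subgroup generated by H and p is H together with p + H.
            ideal |= {add(p, z) for z in ideal}
    return frozenset(ideal)

def core(ideal, ring):
    """The core (a : R) = {x in R | Rx is contained in a}."""
    return frozenset(x for x in ring if all(mul(r, x) in ideal for r in ring))

def is_primitive(ring, max_gens=4):
    """Proposition 8.2.1: primitive iff some maximal right ideal has zero core."""
    ideals = {right_ideal(gens, ring)
              for k in range(max_gens + 1)
              for gens in combinations(ring, k)}
    proper = [a for a in ideals if len(a) < len(ring)]
    maximal = [a for a in proper if not any(a < b for b in proper)]
    return any(core(a, ring) == frozenset({ZERO}) for a in maximal)

M2 = list(product((0, 1), repeat=4))   # the full matrix ring over F_2
T2 = [m for m in M2 if m[2] == 0]      # the upper triangular matrices
```

On these inputs `is_primitive(M2)` holds while `is_primitive(T2)` fails: both maximal right ideals of the triangular ring are two-sided, so each equals its own core.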


Corollary 8.2.2. A commutative ring is primitive if and only if it is a field.


Proof. In this case the core of $\mathfrak{a}$ is $\mathfrak{a}$ itself, so 0 must be the maximal ideal of $R$ and
this means that $R$ is a field. ∎


To give an example, any simple ring (Artinian or not) is primitive, for $R$ has a
maximal right ideal by Krull’s theorem (see BA, Theorem 4.2.6) and its core is a
proper ideal, which must be 0. The converse does not hold: if $V$ is an
infinite-dimensional vector space over a field and $E$ is its endomorphism ring, then $E$ acts
faithfully on $V$ and $V$ is clearly simple as $E$-module, so $E$ is primitive, but as we
saw in Section 8.1, $E$ is not simple. The next result describes primitive rings more
precisely:


Theorem 8.2.3. Any primitive ring is isomorphic to a dense ring of linear
transformations in a vector space over a skew field $K$. Conversely, any dense subring of
$\operatorname{End}_K(V)$ is primitive.


Proof. Given a primitive ring $R$, let $V$ be a simple $R$-module on which $R$ acts
faithfully. Its centralizer $K$ is a skew field, by Schur’s lemma, and $R$ is naturally embedded
as a dense subring in $\operatorname{End}_K(V)$, by Corollary 8.1.9. For the converse we need only
observe that any dense subring $R$ of $\operatorname{End}_K(V)$ acts simply: given $u, v \in V$, $u \neq 0$,
there exists $a \in R$ such that $ua = v$. Hence the $R$-submodule generated by any
$u \neq 0$ is $V$, i.e. $V$ is simple. ∎


As we saw, a primitive ring need not be simple, but if $R$ is right Artinian as well as
primitive, say it is a dense subring of $\operatorname{End}_K(V)$, then $V$ is finitely generated over $K$,
by Theorem 8.1.8, hence $V \cong K^n$ and $R$ acts bicentrally: $R \cong \operatorname{End}_K(K^n) \cong K_n$. This
expression is unique up to isomorphism of $K$ (BA, Theorem 5.2.2). In the general
case there is no such uniqueness, but we have the following consequence which is
sometimes useful:


Proposition 8.2.4 (O. Litoff). Let $R$ be a primitive ring which is not right Artinian.
Then for every $n \geq 1$, $R$ has a subring with a homomorphism onto a full $n \times n$
matrix ring over a skew field.


Proof. We may take $R$ to be a dense subring of $\operatorname{End}_K(V)$, where $V$ is an
infinite-dimensional vector space over $K$. Given $n \geq 1$, take an $n$-dimensional subspace $U$
of $V$, with a basis $u_1, \dots, u_n$. Let $R_1$ be the subring of $R$ mapping $U$ into itself:
every element of $R_1$ defines by restriction an endomorphism of $U$, thus we have a
homomorphism $R_1 \to \operatorname{End}_K(U) \cong K_n$, and this is surjective, by density, so it is the
required homomorphism. ∎


Let $R$ be a ring with a minimal right ideal and define the socle $\mathfrak{s}$ of $R$ as the sum of
all minimal right ideals. This socle is an ideal, for, given any minimal right ideal $\mathfrak{a}$ of
$R$ and any $x \in R$, then $x\mathfrak{a}$ is a minimal right ideal or 0 and so is contained in $\mathfrak{s}$.
Primitive rings with non-zero socle have a more precise description:


Theorem 8.2.5. A primitive ring $R$ has a non-zero socle if and only if in its representation
as a dense ring of linear transformations of a $K$-space, $R$ contains transformations of
finite (non-zero) rank. When this is so, all faithful simple right ideals of $R$ are
isomorphic and the skew field $K$ is determined up to isomorphism as the centralizer
of a faithful simple right $R$-module.


Proof. If there is an element of $R$ defining a linear transformation of finite rank,
take $c \in R$ such that the rank $\rho(c)$ is the least positive number possible. Then
$\ker cx \supseteq \ker c$ for any $x \in R$, and if $cx \neq 0$, then $\rho(cx) = \rho(c)$, hence
$\ker cx = \ker c$, and the complement of $\ker c$ is finite-dimensional. By density we can find
$y \in R$ such that $cxy = c$; this shows $cR$ to be minimal, hence $R$ has a non-zero socle.

Now assume that $R$ has a non-zero socle, and hence a minimal right ideal $\mathfrak{a}$. If $M$
is any faithful simple right $R$-module, then $M\mathfrak{a} \neq 0$, so $u\mathfrak{a} \neq 0$ for some $u \in M$. But
$u\mathfrak{a}$ is a submodule of $M$, hence $u\mathfrak{a} = M$ and so the mapping $x \mapsto ux$ $(x \in \mathfrak{a})$ is a
surjective homomorphism $\mathfrak{a} \to M$. By the minimality of $\mathfrak{a}$, $M \cong \mathfrak{a}$ as right
$R$-module, so $\mathfrak{a}$ is also faithful; this shows that every simple faithful right $R$-module
is isomorphic to $M$. It follows that the isomorphism type of $K$ is uniquely determined
as the endomorphism ring of $\mathfrak{a}$. Finally, since $\mathfrak{a}$ is faithful, $\mathfrak{a}^2 \neq 0$, hence $\mathfrak{a}^2 = \mathfrak{a}$, and
so $c\mathfrak{a} = \mathfrak{a}$ for some $c \in \mathfrak{a}$. We claim that $\rho(c) = 1$; for if not, then there exist
$x, y \in V$, where $V$ is the $K$-space on which $R$ acts, such that $xc$, $yc$ are linearly
independent over $K$, and by density there exists $b \in R$ such that $xcb \neq 0$, $ycb = 0$. Then
$\mathfrak{a} = cR$ meets the annihilator $\mathfrak{n}$ of $y$ in $R$ in the non-zero element $cb$, and $\mathfrak{n}$ is a right
ideal, so $\mathfrak{a} \subseteq \mathfrak{n}$, by the minimality of $\mathfrak{a}$. But this means that $yc = 0$, which is a
contradiction, and it shows that $R$ contains elements of rank 1. ∎


By contrast, simple rings with minimal right ideals are much more special: 


Proposition 8.2.6. Any simple ring with minimal right ideals is Artinian. 


Proof. Let R be a simple ring with minimal right ideals. The sum of all minimal right 
ideals is the socle, a two-sided ideal, which coincides with R, by simplicity. Thus R is a 
sum of simple right R-modules, hence a direct such sum, and since R is finitely 
generated (by 1), this direct sum is finite. Thus R is right Artinian. Now R is also 
semisimple as left R-module, hence it is also left Artinian, and so it is an Artinian 
ring. ∎


Exercises 


1. For any ring R and any n > 1, show that R_n is primitive iff R is. More generally, 
show that being primitive is a Morita invariant. 

2. Show that every minimal right ideal in a primitive ring has an idempotent 
generator. (Hint. Use the primitivity to show that the right ideal is not nilpotent.) 

3. Show that if e is a non-zero idempotent in a primitive ring R, then eRe is 
primitive. 

4. Show that for any idempotent e in a ring R, eR is a minimal right ideal iff Re is a 
minimal left ideal. Deduce that in a primitive ring with non-zero socle, the socle 
coincides with the left socle (defined correspondingly). 

5. Show that the socle of a primitive ring is a minimal two-sided ideal. Deduce that a 
simple ring with non-zero socle is Artinian (Proposition 8.2.6). 




6. Show that a left Artinian primitive ring is simple. What can be said about a (right) 
primitive ring with minimal left ideals? 

7. Show that the centre of a primitive ring is an integral domain. If A is any 
commutative integral domain with field of fractions K, show that the set of 
infinite matrices over K which are equal to a scalar in A outside a finite square 
is a primitive ring with centre A. 


8.3 Semiprimitive rings and the Jacobson radical 


In the last section we defined primitive rings as rings with a faithful simple module. 
Often one needs to consider a wider class of rings, and we define a ring to be semiprimitive 
if it has a faithful semisimple module; clearly it is equivalent to require that 
for each a ∈ R^× there exists a simple module M such that Ma ≠ 0; for if such a 
module M_a is chosen for each a, then ⊕ M_a is faithful and semisimple, and conversely, 
if M is faithful and semisimple, then each a ≠ 0 acts non-zero on at least 
one simple summand. 

We recall from BA (Lemma 5.3.2) that J(R), the Jacobson radical of R, is defined 
in the following equivalent ways: 


J(R) is the set of all a ∈ R such that 

(a) Ma = 0 for each simple right R-module M, 
(b) a belongs to each maximal right ideal, 
(c) 1 − ay has a right inverse for each y ∈ R, 
(d) 1 − xay has an inverse for all x, y ∈ R, 
(a°)-(d°) the left-hand analogues of (a)-(d). 


It is clear from (a) of this definition that a ring R is semiprimitive precisely when 
J(R) = 0. Moreover, the symmetry of the definition shows that the notion ‘semi- 
primitive’ is left-right symmetric. This is in contrast to the situation for primitive 
rings, where one-sided examples exist, first found by George Bergman in 1964 and 
Arun Jategaonkar in 1968. 

We also recall that a quasi-inverse of an element c is an element c’ such that 


c + c′ = cc′ = c′c.     (8.3.1) 


The quasi-inverse, when it exists, is unique, because 1 — c’ is the inverse of 1 —c. 
An element which has a quasi-inverse is sometimes called quasi-regular; for example, 
any nilpotent element is quasi-regular. Now J(R) may also be defined as the largest 
ideal consisting entirely of quasi-regular elements; this is an easy consequence of 
(d) above. 
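
The remark that nilpotent elements are quasi-regular can be checked numerically. The following is only a minimal sketch, assuming integer matrices stored as nested lists; the helper names `mat_mul`, `mat_add` and `quasi_inverse_nilpotent` are illustrative and do not come from the text. For nilpotent c the sum c′ = −(c + c^2 + ...) is finite, and it satisfies (8.3.1) because 1 − c′ = 1 + c + c^2 + ... is the inverse of 1 − c.

```python
def mat_mul(x, y):
    """Product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_add(x, y):
    """Entrywise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


def quasi_inverse_nilpotent(c):
    """Return c' = -(c + c^2 + ...) for a nilpotent n x n integer matrix c."""
    n = len(c)
    zero = [[0] * n for _ in range(n)]
    total, power = zero, c
    for _ in range(n):              # a nilpotent n x n integer matrix has c^n = 0
        if power == zero:
            break
        total = mat_add(total, power)
        power = mat_mul(power, c)
    if power != zero:
        raise ValueError("matrix is not nilpotent")
    return [[-v for v in row] for row in total]


# A strictly upper triangular (hence nilpotent) example.
c = [[0, 1, 2], [0, 0, 3], [0, 0, 0]]
c_prime = quasi_inverse_nilpotent(c)
# Relation (8.3.1): c + c' = cc' = c'c.
assert mat_add(c, c_prime) == mat_mul(c, c_prime) == mat_mul(c_prime, c)
```

The loop stops as soon as a power vanishes, so the sketch also rejects matrices that are not nilpotent instead of returning a wrong answer.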

Semiprimitive rings admit a subdirect product representation which is sometimes 
useful; however, it should be borne in mind that a given product may contain many 
different subdirect products, and the relation to the direct product is not very close. 


Theorem 8.3.1. Every semiprimitive ring R is a subdirect product of primitive rings 
which are homomorphic images of R. Conversely, every subdirect product of primitive 
rings is semiprimitive. 




Proof. Let {p_λ} be the family of all maximal right ideals of the semiprimitive ring R, 
and denote the core of p_λ by c_λ. Since R is semiprimitive, we have ∩ p_λ = 0, and since 
c_λ ⊆ p_λ, it follows that ∩ c_λ = 0. If we put R_λ = R/c_λ, then R_λ is primitive, for it 
is represented faithfully on the simple module R/p_λ. Now the natural maps 
f_λ : R → R_λ can be combined to a homomorphism into the direct product 


f : R → P = ∏ R_λ,     (8.3.2) 


and ker f = ∩ ker f_λ = ∩ c_λ = 0. Thus f is injective and if ε_λ : P → R_λ denotes the 
canonical projection, then fε_λ = f_λ is surjective, by the definition of R_λ, so (8.3.2) 
is the required subdirect product representation. Conversely, if {R_λ} is a family of 
primitive rings and M_λ is a faithful simple R_λ-module, then for any subdirect product 
R of the R_λ and any a ∈ R^× we have aε_λ ≠ 0 for some λ, hence a has a non-zero 
action on M_λ and this shows R to be semiprimitive. ∎


In particular, when R is commutative, we have by Corollary 8.2.2, 


Corollary 8.3.2. Any commutative semiprimitive ring is a subdirect product of fields, 
and conversely, such a subdirect product is semiprimitive. ∎


For example, Z is semiprimitive because 0, 2 are the only quasi-regular elements 
and so J(Z) = 0. Hence Z is a subdirect product of fields; in fact Z is a subdirect 
product of the fields F_p, where p ranges over all primes. 

The definition of J(R) shows that it measures how far R is from being semi- 
primitive. It is a pleasant (and by no means self-evident) property that R/J(R) is 
semiprimitive. This follows from the next result, itself more general. 


Proposition 8.3.3. Let R be a ring and a an ideal such that a ⊆ J(R). Then 


J(R/a) = J(R)/a.     (8.3.3) 


Proof. The natural homomorphism R → R/a induces a lattice-isomorphism between 
the lattice of right ideals of R/a and that of all right ideals of R which contain a. Each 
maximal right ideal of R/a corresponds to a maximal right ideal of R; the converse 
also holds because a ⊆ J(R) = ∩{max. right ideals}. Taking intersections of these 
sets of maximal right ideals we obtain (8.3.3). ∎


If in (8.3.3) we put a = J(R), the right-hand side reduces to 0 and we deduce 

Corollary 8.3.4. For any ring R, R/J(R) is semiprimitive. ∎


We note that (8.3.3) may not hold without restriction on a, for the mapping 
J(R) → J(R/a) is not generally surjective, e.g. if R = Z, a = (4), then J(Z) = 0, 
but J(Z/4) ≠ 0. 
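
The example R = Z, a = (4) can be verified by brute force. The sketch below is an illustration only: it computes the radical of Z/n directly from characterization (c) of Section 8.3 (a lies in the radical iff 1 − ay is a unit for every y, which in a commutative ring is the same as having a right inverse). The function name `jacobson_radical_mod` is an assumption made for this example.

```python
from math import gcd


def jacobson_radical_mod(n):
    """List the residues a in Z/n (n >= 2) with 1 - a*y a unit for every y."""
    def is_unit(u):
        # u is a unit of Z/n exactly when gcd(u, n) = 1
        return gcd(u % n, n) == 1

    return [a for a in range(n) if all(is_unit(1 - a * y) for y in range(n))]


# J(Z) = 0 maps to the zero ideal of Z/4, but the radical of Z/4 is larger.
print(jacobson_radical_mod(4))   # residues 0 and 2
print(jacobson_radical_mod(6))   # only 0, so Z/6 is semiprimitive
```

In this sketch the radical of Z/n comes out as the set of multiples of the product of the distinct primes dividing n, which is the intersection of the maximal ideals (p)/(n).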

It is well known (see BA, Section 5.3) and easily checked that the Jacobson radical 
contains all nil ideals and for an Artinian ring R, J(R) is nilpotent, although in 
general J(R) need not even be nil, e.g. in the power series ring k[[x]] the Jacobson 
radical is (x). However, in the absence of nil ideals we obtain a semiprimitive ring by 
adjoining an indeterminate. 


Proposition 8.3.5 (Amitsur). If R is a ring with no non-zero nil ideals, then R[t] is 
semiprimitive, where t is an indeterminate. 


Proof. We have to show that the radical J of R[t] is 0. If J ≠ 0, let a be the set 
consisting of 0 and all leading coefficients of elements in J. It is clear that a is 
an ideal in R and the conclusion will follow if we prove that a = 0. Let a_1 ∈ a, 
say f = a_1 t^n + ... ∈ J. Then ft ∈ J and so there exists g ∈ R[t] such that 
(1 + g)(1 − ft) = 1, and hence 


(1 + g)(1 − (ft)^r) = 1 + ft + ... + (ft)^{r−1} 

for all r ≥ 1. Hence we obtain 

(1 + g)(ft)^r = g − ft − (ft)^2 − ... − (ft)^{r−1}. 


Let us take r > deg g and equate the coefficients of terms of degree r(n + 1). On the 
right there is no contribution, while on the left we have (1 + g)a_1^r, therefore a_1^r = 0. 
Thus a is a nil ideal, hence a = 0 and so J = 0, as we had to show. ∎


We can now also prove the basic theorem on PI-algebras: 


Theorem 8.3.6 (Kaplansky’s theorem, 1948). Let R be a primitive PI-algebra, with a 
polynomial identity of degree d. Then R is a simple algebra of finite dimension n^2 over its 
centre, where n ≤ d/2. More precisely, if V is a simple faithful R-module and 
D = End_R(V), then R ≅ M_m(D), where m = [V : D]. 


Proof. Let p be a polynomial of degree d which vanishes on R; by Proposition 7.5.3 
we may take p to be multilinear. Since R is primitive, there is a simple faithful 
R-module V; we identify R with its image in End(V) and put D = End_R(V). By Schur’s 
lemma D is a skew field, and by the density theorem (Theorem 8.1.6) R is dense in 
End_D(V). If [V : D] is finite, we have R = End_D(V) and the result follows; otherwise, 
by Proposition 8.2.4 we can for any m find a subring of R mapping onto 
D_m, so D_m again satisfies p = 0. By the staircase lemma (Lemma 7.5.9) it follows 
that d ≥ 2m, so we have a bound on m. Hence [V : D] = m ≤ d/2 and 
R = End_D(V) ≅ D_m. 

In particular, this shows that R is simple and its centre is the centre of D, a field C. 
Let K be a maximal commutative subfield of D containing C; we have a natural 
homomorphism R ⊗_C K → RK, but the left-hand side is simple, hence R ⊗_C K ≅ 
RK. Now RK is a K-algebra, its centre is K and it acts densely on V, therefore 
RK ≅ K_n, where again n ≤ d/2. Moreover, [R : C] = [RK : K] = n^2. ∎

We record separately the special case of a skew field: 


Corollary 8.3.7. A skew field which satisfies a polynomial identity of degree d is of finite 
dimension ≤ [d/2]^2 over its centre. ∎




The result can be extended to semiprimitive rings as follows. We remark that any 
ring R can be embedded in the full matrix ring R_n (e.g. as scalar matrices), hence R_m 
can be embedded in R_n whenever m | n. 


Corollary 8.3.8. Any semiprimitive PI-algebra satisfying an identity of degree d can 
be embedded in a matrix algebra M_r(A) over a commutative ring A, where r ≤ [d/2]!. 


Proof. By Theorem 8.3.1, R is a subdirect product of primitive rings R_λ, where R_λ is 
a homomorphic image of R and hence again satisfies an identity of degree d. By 
Theorem 8.3.6, R_λ is a simple algebra of degree n_λ over its centre, where n_λ ≤ d/2. 
By taking a splitting field E_λ, we can thus embed R_λ in a matrix algebra M_{n_λ}(E_λ). 
The least common multiple of the degrees n_λ which occur is r ≤ [d/2]! and R_λ 
can also be embedded in M_r(E_λ). Now ∏ M_r(E_λ) ≅ M_r(A), where A = ∏ E_λ, and 
so we have an embedding of R in M_r(A). ∎


If R is a PI-algebra without non-zero nil ideals, then R[t] is semiprimitive, by 
Proposition 8.3.5 and it is again a PI-algebra, by Corollary 7.5.4, hence we obtain 


Theorem 8.3.9. Let R be a PI-algebra without non-zero nil ideals. Then R can be 
embedded in M_n(A), where A is a commutative ring, and if R satisfies an identity of 
degree d, then n ≤ [d/2]!. ∎


For Artinian rings ‘semiprimitive’ reduces to ‘semisimple’; this follows from the 
fact that for an Artinian ring R, R/J(R) is semisimple (see BA, Theorem 5.3.5), 
and it will also be derived in a more general context below in Section 8.4. In the 
Artinian case the radical may be described as the intersection of all maximal two- 
sided ideals. This does not hold generally: we clearly have, for any ring R, 


∩{max. left ideals} = ∩{max. right ideals} ⊆ ∩{max. ideals},     (8.3.4) 


where the first equality holds by the characterization of J(R). For an example where 
the inclusion is strict, take R = End_K(V), where V is an infinite-dimensional vector 
space over a field K. We have seen in Section 8.2 that R is primitive, hence J(R) = 0, 
but the ideals in R form a chain, so the intersection on the right is the unique maximal 
ideal of R. Let us denote by s the socle of R; in the representation on V this 
corresponds to the set of elements of finite rank. Assuming [V : K] to be countable, 
with basis {e_i}, let us write a_i for the set of elements of R mapping e_i to 0. Then a_i is a 
maximal right ideal and s ⊄ a_i, ∩ a_i = 0. Similarly, if b_i is the set of elements of R 
mapping V into Σ_{j≠i} Ke_j, then b_i is a maximal left ideal such that s ⊄ b_i, ∩ b_i = 0. 
By Krull’s theorem R also has maximal right ideals (and maximal left ideals) containing 
s, but the proof is non-constructive, and there is no obvious procedure 
for finding them. 




Exercises 


1. Show that a subdirect product of semiprimitive rings is semiprimitive. 

2. Show that the Jacobson radical of a ring contains no non-zero idempotent. 
Deduce that every regular ring is semiprimitive. 

3. Verify that in an Artinian ring the intersection of all maximal two-sided ideals is 
just the radical. 

4. Let R be a ring and a an ideal in R. Show that if J(R/a) = 0, then J(R) ⊆ a. 

5. Show that a subdirect product of a finite number of simple rings is a direct 
product of simple rings. 


8.4 Non-unital algebras 


So far we have taken the existence of a unit-element or ‘one’ as part of the definition 
of a ring, but there are some occasions when a ‘non-unital’ ring arises naturally. For 
example, the algebra C(X ) of all continuous functions on a topological space X has a 
one precisely when X is compact. We shall maintain the convention that a ring neces- 
sarily has a one, and allow for the case where a one is lacking by speaking of an 
algebra. The algebra is called unital if it has a one. The coefficient ring K (with 1) 
may be any commutative ring; this is no restriction since every ring may be regarded 
as a Z-algebra. 

Let A be a K-algebra; by a right A-module M we understand a (K, A)-bimodule 
with the rule α(xa) = x(αa) for all x ∈ M, α ∈ K, a ∈ A. Even if A has a one, e 
say, this need not define the identity mapping on M. If it does, i.e. if 


xe = x for all x ∈ M, 


the module M is said to be unital. 

Our first observation is that a K-algebra may always be embedded in a unital 
K-algebra. This may be done in many ways; we shall single out one which uses the 
notion of an augmented algebra. A unital K-algebra A is said to be augmented if there 
exists a K-algebra homomorphism, called the augmentation mapping: 


ε : A → K. 


This means that ε is a ring homomorphism such that (αa)ε = α(aε) for all α ∈ K, 
a ∈ A. In particular, (αe)ε = α, hence αe ↦ α is a bijection between K.e (where e 
is the one of A) and K, and we may embed K in A by identifying α with αe. The 
kernel of ε is the augmentation ideal of A and we have the direct sum decomposition 


A = ker ε ⊕ K,     (8.4.1) 

corresponding to the decomposition for any x ∈ A: 


x = (x − xε) + xε. 




For any K-algebra A we can form the augmented algebra A^1 = A ⊕ K by defining the 
multiplication 


(a, α)(b, β) = (ab + αb + aβ, αβ), 


and this algebra has the unit-element (0, 1) and augmentation ideal A. Conversely, if 
C is an augmented K-algebra with augmentation ideal A, then A^1 ≅ C, as augmented 
K-algebras. Moreover, the category of A-modules is equivalent to the category of 
unital A^1-modules. 
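
The multiplication rule for the augmented algebra is easy to experiment with. The following sketch takes, purely as an assumed example, K = Z and A = 2Z (the even integers, a Z-algebra without a one) and represents elements of A ⊕ K as pairs; the names `aug_mul`, `augmentation` and `ONE` are illustrative and not taken from the text.

```python
def aug_mul(x, y):
    """(a, alpha)(b, beta) = (ab + alpha*b + a*beta, alpha*beta)."""
    (a, alpha), (b, beta) = x, y
    return (a * b + alpha * b + a * beta, alpha * beta)


def augmentation(x):
    """The augmentation mapping: project (a, alpha) onto alpha in K."""
    return x[1]


ONE = (0, 1)  # the adjoined unit-element

# Sample pairs (a, alpha) with a even (a in A = 2Z) and alpha in K = Z.
samples = [(a, alpha) for a in (-4, 0, 2, 6) for alpha in (-1, 0, 3)]

for x in samples:
    assert aug_mul(ONE, x) == x == aug_mul(x, ONE)          # (0, 1) acts as one
    for y in samples:
        # the augmentation is multiplicative
        assert augmentation(aug_mul(x, y)) == augmentation(x) * augmentation(y)
        for z in samples:
            # associativity of the new multiplication
            assert aug_mul(aug_mul(x, y), z) == aug_mul(x, aug_mul(y, z))
```

Pairs of the form (a, 0) are exactly the kernel of the augmentation here, and multiplying such a pair by anything again gives second component 0, which reflects the statement that A is the augmentation ideal.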

For K-algebras the notion of simple module has to be modified. A right module M 
over a K-algebra A is called simple if MA ≠ 0 and M has no submodules other than 
0 and M. When A has a one and acts unitally, this reduces to the previous definition. 
Our object will be to study simple A-modules in terms of A^1 (where the results of 
Section 8.3 can be used). 

Any simple A-module M can again be represented as A/I for a maximal right ideal 
I, but we must also have A^2 ⊄ I, to ensure that MA ≠ 0. Further, we can no longer 
use Krull’s theorem to find maximal right ideals because A may not be finitely 
generated as right A-module. To overcome these difficulties, let us look more closely 
at the correspondence between right ideals in A and in A^1. 

Let I be a right ideal of A and I′ a right ideal of A^1 such that I′ ⊇ I. Then 
I ⊆ I′ ∩ A, so there is a natural homomorphism of A-modules 


A/I → A/(I′ ∩ A) ≅ (A + I′)/I′ ⊆ A^1/I′. 


This is an isomorphism iff I′ ∩ A = I and I′ + A = A^1. Given any right ideal I′ of A^1 
such that A + I′ = A^1, we can put I = I′ ∩ A; then by what has just been said, 
A/I ≅ A^1/I′. In particular, this holds for any maximal right ideal I′ of A^1 which 
does not contain A. If we start from I in A, the next lemma shows under what 
conditions we can find a suitable I′. 


Lemma 8.4.1. Let A be a K-algebra with a right ideal I. Then there is a right ideal I′ of 
A^1 such that 


I′ ∩ A = I,  I′ + A = A^1     (8.4.2) 

if and only if A contains an element e such that 

(1 − e)A ⊆ I.     (8.4.3) 


In (8.4.3) 1 is the one of A^1, but we can express (8.4.3) entirely within A by writing 
a − ea ∈ I for all a ∈ A. A right ideal I satisfying (8.4.3) for some e ∈ A is called a 
modular right ideal. 


Proof. If (8.4.2) holds, we can write 1 = u + e, where u ∈ I′, e ∈ A. Then 
(1 − e)A = uA ⊆ I′ ∩ A = I, and (8.4.3) follows. Conversely, given (8.4.3), we put 
I′ = I + (1 − e)K. Then I′ is a right ideal in A^1, by (8.4.3) and I′ + A is a right 
ideal containing (1 − e) + e = 1, hence I′ + A = A^1. Moreover, if x = a + 
(1 − e)α ∈ A, where α ∈ K, a ∈ I, then α = 0, hence x ∈ I, so I′ ∩ A = I. ∎




This lemma shows in particular that for any maximal right ideal I′ of A^1 which 
does not contain A, I′ ∩ A is a modular right ideal of A. 

By a maximal modular right ideal of A we shall understand a maximal member of 
the set of all proper modular right ideals. Any right ideal containing a modular right 
ideal is again modular (by the definition), hence a maximal modular right ideal is 
also maximal in the set of all proper right ideals. Moreover, any proper modular 
right ideal is contained in a maximal modular right ideal for, given I ⊇ (1 − e)A, 
we can by Krull’s theorem find a right ideal containing I but not e, and maximal 
with these properties, and this is easily seen to be modular. In fact the notion of a 
modular right ideal may be regarded as a device for producing maximal right 
ideals in non-unital algebras; the corresponding quotients are simple modules, as 
we shall see below. 

In any K-algebra A let us define the Jacobson radical J(A) as the set of all elements 
of A represented by 0 in any simple A-module. If J(A) = 0, A is said to be semi- 
primitive. For unital algebras these definitions reduce to the earlier ones, by the 
characterization quoted in Section 8.3. Now J(A) can be described as follows in 
terms of the maximal modular right ideals of A: 


Theorem 8.4.2. Let A be a K-algebra, where K is a semiprimitive coefficient ring, and 
denote by A^1 the corresponding augmented K-algebra. Then J(A) = J(A^1), and J(A) is 
the intersection of all maximal modular right ideals of A. 


Proof. It is clear that a right A-module M such that MA ≠ 0 is simple iff it is simple 
as A^1-module. Thus for each a ∈ A, 


a ∈ J(A) ⇔ a is represented by 0 in every simple A-module 
         ⇔ a is represented by 0 in every simple A^1-module 
         ⇔ a ∈ J(A^1) ∩ A. 

It follows that J(A) = J(A^1) ∩ A. Now assume that x ∈ J(A^1)\A; then 

x = a + α.1, where a ∈ A, α ∈ K, α ≠ 0.     (8.4.4) 


Since K is semiprimitive, there is a maximal ideal m of K not containing α. The residue 
class field K/m has a natural K-module structure (by multiplication) and we can 
define an A^1-module structure by the rule u.y = u(yε), for u ∈ K/m, y ∈ A^1. Then 
K/m is a simple A^1-module, and if 1̄ is the residue class of 1, then by applying the 
element (8.4.4) we get 1̄.x = 1̄.α ≠ 0. This contradicts the fact that x ∈ J(A^1); 
hence J(A^1) ⊆ A and it follows that J(A^1) = J(A). Now 


a ∈ J(A) ⇔ a ∈ each maximal right ideal of A^1 
         ⇔ a ∈ each maximal right ideal of A^1 which does not contain A 
         ⇔ a ∈ each maximal modular right ideal of A, 


by the remarks preceding the theorem. ∎


From the definition of the Jacobson radical quoted in Section 8.3 we now obtain 


Corollary 8.4.3. Let A be a K-algebra, where K is semiprimitive. Then J(A) consists of 
all elements a ∈ A such that ay has a quasi-inverse, for all y ∈ A. ∎


We note that the intersection of all the maximal right ideals of A is in general 
different from J(A). For example, let k be a field and A the algebra of formal 
power series in x with zero constant term. Then J(A) = A, but the intersection of 
all maximal ideals is xA. More generally, simple algebras have been constructed 
which coincide with their Jacobson radical, by Edward Sasiada in 1961 (see Sasiada 
and Cohn [1967]). 

We can now prove the Wedderburn structure theorem for semisimple rings under 
weaker hypotheses. First an auxiliary result on modular right ideals; we recall that 
two right ideals a, b of A are called comaximal if a + b = A. 


Lemma 8.4.4. Let A be any K-algebra and a, b modular right ideals which are 
comaximal. Then a ∩ b is again modular. 


Proof. By hypothesis there exist e, f ∈ A such that (1 − e)A ⊆ a, (1 − f)A ⊆ b. Since 
a + b = A, by comaximality, there exist a_i ∈ a, b_i ∈ b (i = 1, 2) such that 


e = a_1 + b_1,  f = a_2 + b_2. 


Hence for any x ∈ A, 

(a_2 + b_1)x ≡ b_1 x ≡ ex ≡ x (mod a), 
(a_2 + b_1)x ≡ a_2 x ≡ fx ≡ x (mod b). 


Therefore (1 − (a_2 + b_1))A ⊆ a ∩ b and the conclusion follows. ∎


Theorem 8.4.5 (Wedderburn decomposition theorem). Let A be a semiprimitive 
K-algebra over a semiprimitive coefficient ring K, and assume that A is right Artinian. 
Then A is unital and semisimple, as a ring. 


Proof. Let {I_λ} be the family of all maximal modular right ideals of A. Since A is 
semiprimitive, ∩ I_λ = 0, and since A is right Artinian, the chain I_1 ⊇ I_1 ∩ I_2 ⊇ ... 
breaks off, hence for some r, 


I_1 ∩ I_2 ∩ ... ∩ I_r = 0.     (8.4.5) 


By omitting superfluous terms, we can choose this representation to be irredundant, 
i.e. such that I_1 ∩ ... ∩ I_{j−1} ⊄ I_j. Since I_j is maximal modular, it is comaximal with 
I_1 ∩ ... ∩ I_{j−1}, so by Lemma 8.4.4, 0 is modular, i.e. (1 − e)A = 0 for some e ∈ A. 
Hence A(1 − e) is a nilpotent left ideal, which must be 0, by semiprimitivity, so 
x = ex = xe for all x ∈ A and e is the one of A. 

Since each I_i is a maximal right ideal, P_i = A/I_i is a simple right A-module and A is 
a submodule of the direct sum ⊕ P_i. As submodule of a semisimple module, A itself 
is semisimple as right A-module, hence it is semisimple as a ring. ∎




In Section 8.2 we saw that any simple ring with minimal right ideals is Artinian; 
for non-unital algebras this is no longer always so, and it is of some interest to 
examine the form such algebras take. To begin with we need to clarify what is to 
be understood by a simple algebra; for example, a 1-dimensional vector space A 
over a field with multiplication xy = 0 for all x, y ∈ A has no ideals apart from 0 
and A, but such trivial cases should clearly be excluded. We therefore define an 
algebra A to be simple if A^2 ≠ 0 and A has no ideals other than 0, A. Now a 
simple algebra with minimal right ideals is again equal to its socle, but the latter 
need not be finitely generated. To describe such algebras more precisely we need 
the notion of a Rees matrix algebra. 

In the rest of this section we shall take all our algebras to be bimodules over a skew 
field K such that x(yz) = (xy)z for any x, y, z in K or in the algebra. Thus they could 
be described as K-rings, were it not for the fact that they will in general lack 1. We 
shall regard them as algebras over the centre of K; an alternative would be to suspend 
the convention that rings have a 1, but we shall not take that course to avoid con- 
fusion. In practical terms this makes no difference, since we shall not have occasion 
to consider the augmented algebra. 

Let C be any algebra and I, Λ any sets; we shall write ^Λ C^I for the set of all matrices 
over C with rows indexed by Λ and columns indexed by I, briefly, Λ × I matrices. 
Fix a matrix P in ^Λ C^I which is regular, i.e. whose rows are left linearly independent 
and whose columns are right linearly independent over C. By the Rees matrix algebra 
over C with sandwich matrix P one understands the set M of all I × Λ matrices 
A = (a_{iλ}) over C, with almost all entries zero, with componentwise addition: 


(a_{iλ}) + (b_{iλ}) = (a_{iλ} + b_{iλ}), 

and with the multiplication 

A ∗ B = APB, where A = (a_{iλ}), B = (b_{iλ}). 


This is well-defined, since A, B are both I × Λ and zero almost everywhere, while P 
is Λ × I. With these definitions we have the following characterization of simple 
algebras with minimal right ideals. 
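
As a small illustration of the sandwich multiplication, here is a sketch with finite index sets I = Λ = {0, 1} over the rational numbers; in this finite case a regular sandwich matrix is simply an invertible square matrix. The concrete matrices and the helper names `mat_mul`, `rees_mul` and `unit_matrix` are assumptions made for the example only. The final check uses the element p^{-1}E_{iλ}, with p = p_{λi} ≠ 0, which appears as an idempotent in the proof of Theorem 8.4.6.

```python
from fractions import Fraction as F


def mat_mul(x, y):
    """Product of a (rows x mid) and a (mid x cols) matrix given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def rees_mul(a, b, p):
    """Sandwich product A * B = A P B for I x Λ matrices a, b and a Λ x I matrix p."""
    return mat_mul(mat_mul(a, p), b)


def unit_matrix(i, lam, rows, cols, scale=F(1)):
    """scale * E_{i,lam}: the rows x cols matrix with a single non-zero entry."""
    return [[scale if (r, c) == (i, lam) else F(0) for c in range(cols)]
            for r in range(rows)]


P = [[F(1), F(2)], [F(3), F(4)]]       # sandwich matrix (Λ x I), invertible
A = [[F(1), F(0)], [F(2), F(-1)]]      # elements of M are I x Λ matrices
B = [[F(0), F(5)], [F(1), F(1)]]
C = [[F(2), F(3)], [F(0), F(7)]]

# The sandwich product is associative because matrix multiplication is.
assert rees_mul(rees_mul(A, B, P), C, P) == rees_mul(A, rees_mul(B, C, P), P)

# Pick (λ, i) with p = p_{λ i} non-zero; then e = p^{-1} E_{i λ} satisfies e * e = e.
lam, i = 1, 0
p = P[lam][i]
e = unit_matrix(i, lam, 2, 2, 1 / p)
assert rees_mul(e, e, P) == e
```

Exact rational arithmetic is used so that the equalities are tested without rounding; over a skew field the same checks would need a non-commutative coefficient type.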


Theorem 8.4.6. For any algebra A the following conditions are equivalent: 


(a) A is a simple algebra with a minimal right ideal, 

(b) A is a prime algebra which coincides with its socle, 

(c) A is isomorphic to a Rees matrix algebra over a skew field, 

(d) A is isomorphic to a dense algebra of linear transformations of finite rank on a 
vector space over a skew field, 

(a°)-(d°) the left-right analogues of (a)-(d). 


Proof. The equivalence (a) ⇔ (d) follows as in Section 8.2, so it only remains 
to prove (a), (b), (c) equivalent; the equivalence to (a°)-(d°) then follows by the 
symmetry of (c). 

(a) ⇒ (b). By hypothesis the socle of A is not zero, so it equals A, and being 
simple, A is prime. 




(b) ⇒ (c). Let r be a minimal right ideal. Since A is prime, r^2 ≠ 0, so r^2 = r and 
hence r = ar for some a ∈ r. Choose e ∈ r such that ae = a; if e^2 ≠ e, then 
(e^2 − e)r = r and we have r = ar = a(e^2 − e)r = 0, a contradiction. Hence e^2 = e 
and of course e ≠ 0, so r = er and e is an idempotent generator. By Lemma 4.3.8 
and Schur’s lemma, End_A(eA) ≅ eAe is a skew field, K say. Now A, being equal to 
its socle, is semisimple as right A-module. Let s be the sum of all right ideals 
isomorphic to r. This is a two-sided ideal; if s ≠ A, then we have A = s ⊕ s′ for 
some non-zero right ideal s′, hence s′s = 0, and so s′ = 0, because A is prime. 
Thus A = s is a direct sum of right ideals isomorphic to r. Now r = eA is a left 
K-space and we can take a basis u_λ (λ ∈ Λ); likewise Ae is a right K-space with 
basis v_i (i ∈ I), say; further we note that u_λ v_i ∈ eAe = K. We define a Λ × I 
matrix P = (p_{λi}) over K by 


p_{λi} = u_λ v_i, 


and claim that P is regular. For if (c_λ) is a family in K, almost all zero, such that 
Σ c_λ p_{λi} = 0 for all i, then Σ c_λ u_λ v_i = 0, hence Σ c_λ u_λ annihilates Ae and so must 
be 0, and now we have c_λ = 0 by the linear independence of the u_λ. Similarly 
Σ p_{λi} d_i = 0 for all λ implies d_i = 0. 

Let M be the Rees matrix algebra over K with sandwich matrix P and define a 
mapping f : M → A by 


(a_{iλ})f = Σ v_i a_{iλ} u_λ.     (8.4.6) 


Since almost all the a_{iλ} vanish, this is well-defined. We shall establish (c) by proving 
that f is an isomorphism. It is clearly additive, and we have 


[(a_{iλ}) ∗ (b_{jμ})]f = Σ v_i a_{iλ} p_{λj} b_{jμ} u_μ 
                    = Σ v_i a_{iλ} u_λ v_j b_{jμ} u_μ 
                    = (a_{iλ})f . (b_{jμ})f; 


thus f is a homomorphism. It is surjective because AeA is the socle and so equal to A, 
and f is injective because P is regular, for if Σ v_i a_{iλ} u_λ = 0, then Σ p_{μi} a_{iλ} p_{λj} = 0, 
hence a_{iλ} = 0. Thus f is an isomorphism and (c) follows. 

(c) ⇒ (a). Given a Rees matrix algebra M over K with sandwich matrix P, take any 
non-zero matrix C = (c_{iλ}) in M. For any (μ, j) ∈ Λ × I consider 


b_{μj} = Σ p_{μi} c_{iλ} p_{λj}. 


If b_{μj} = 0 for all μ, j, then by the regularity of P, c_{iλ} = 0 for all i, λ, a contradiction. 
Hence b_{μj} ≠ 0 for some (μ, j) ∈ Λ × I. Let us write E_{iλ} for the matrix unit with 
(i, λ)-entry 1 and the rest 0. For any d ∈ K we have 


dE_{hμ} ∗ C ∗ E_{jν} = dE_{hμ}PCPE_{jν} = (f_{kρ}), 


where 


f_{kρ} = Σ d δ_{kh} p_{μi} c_{iλ} p_{λj} δ_{νρ} = δ_{kh} δ_{νρ} d b_{μj}. 




Since d was arbitrary in K, f_{hν} ranges over K, and every matrix of M is a finite sum of 
terms dE_{hν}, it follows that every matrix of M lies in the ideal generated by C, hence M 
is simple. 

It remains to find a minimal right ideal. By the definition of P, p_{λi} ≠ 0 for some 
pair (λ, i) ∈ Λ × I. Writing p_{λi} = p, we have p^{-1}E_{iλ} ∗ p^{-1}E_{iλ} = p^{-1}E_{iλ}Pp^{-1}E_{iλ} = 
p^{-1}E_{iλ} ≠ 0, hence e = p^{-1}E_{iλ} is a non-zero idempotent in M. Now the mapping 
γ : K → M defined by c ↦ p^{-1}E_{iλ}c is an injective homomorphism with image 
eMe, as is easily verified. Hence eMe is a skew field and so eM is a minimal right 
ideal of M. ∎


This result is closely analogous to the structure theorem on 0-simple semigroups 
proved by David Rees in 1940. The equivalence of (a) and (c) is due to Eckehart 
Hotzel in 1970; the equivalence of (a) and (d) was proved by Nathan Jacobson in 
1945, building on work by Jean Dieudonné in 1942. 

The generalization of the Wedderburn—Artin theory set forth in Sections 8.2-8.4 
was developed by Jacobson in 1945. The density theorem generalizes a theorem of 
William Burnside to the effect that a monoid acting irreducibly on an n-dimensional 
vector space over an algebraically closed field contains n linearly independent 
endomorphisms; this is the essential content of Corollary 8.1.9. 


Exercises 


1. Show that a non-zero ideal in a primitive ring is again primitive (as a non-unital 
algebra). 

2. Let A be an algebra without nilpotent (two-sided) non-zero ideal. Show that A has 
no nilpotent non-zero left or right ideals. Deduce that for any non-zero idem- 
potent e in A, eA is a minimal right ideal iff eAe is a skew field. 

3. Let A be a nil algebra (i.e. every element is nilpotent). Show that every maximal 
right ideal is two-sided and contains A^2. 

4. Let A be an algebra over a field k; show that if A has no zero-divisors, then A is an 
integral domain iff 0 is not modular, as right ideal. By considering End(A), show 
that A can then be embedded in an integral domain. 

5. Let k be a field and A the k-subalgebra of the free algebra k⟨x, y⟩ consisting of all 
polynomials with zero constant term. Find two modular right ideals of A whose 
intersection is not modular. 

6. Show that if a Λ × I sandwich matrix is row-finite, then |I| ≤ |Λ|. 

7. Let A be a Rees matrix algebra over a skew field K with Λ × I sandwich matrix P. 
Show that |Λ| ≤ 2^{|I|} and give an example where equality occurs. (Hint. If V is a 
K-space of dimension |I|, then its dual V* has dimension 2^{|I|}; interpret the rows 
of P as vectors in V*.) 


8.5 Semiprime rings and nilradicals 


We have already briefly met prime and semiprime rings and now come to examine 
the relation between them. We recall that a prime ring is a ring R ≠ 0 such that for 
any two ideals a, b of R, 


ab = 0 ⇒ a = 0 or b = 0. 


Equivalently, for any a, b € R, aRb = 0 implies a = 0 or b = 0. In the commutative 
case a prime ring is just an integral domain. In general every integral domain is 
prime, but not conversely. More generally we have 


Proposition 8.5.1. Every primitive ring is prime. 


Proof. Let R be primitive and take a simple faithful right R-module M. If a, b are 
non-zero ideals in R, then Ma ≠ 0, hence Ma = M by the simplicity of M; likewise 
Mb = M, so Mab = Mb = M. This shows that ab ≠ 0. ∎

Since the notion ‘prime’ is left-right symmetric, a similar result holds for left 
primitive rings. Of course the converse of Proposition 8.5.1 is false, as we see already 
in the commutative case, when primitive rings are fields, whereas prime rings are 
integral domains. 

By a prime ideal in a ring R we understand a two-sided ideal p such that R/p is 
a prime ring; for commutative rings this reduces to the definition given in BA, 
Section 10.2. We observe that a prime ideal must always be proper. 

There is a method of constructing prime ideals, rather as in the commutative case; 
the notion of multiplicative set has to be replaced here by that of an m-system. By 
this term one understands a subset M of R such that 1 ∈ M and if a, b ∈ M then 
axb ∈ M for some x ∈ R. We note that an ideal is prime iff its complement in R 
is an m-system. Now there is an analogue of BA, Theorem 10.2.6: 


Theorem 8.5.2. Let R be a ring, M an m-system in R and a an ideal in R disjoint from 
M. Then there exists an ideal p in R which contains a, is disjoint from M and is 
maximal with respect to these properties. Any such ideal p is prime. 


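Proof. Let $\mathcal{F}$ be the set of all ideals $\mathfrak{a}'$ of $R$ such that $\mathfrak{a}' \supseteq \mathfrak{a}$, $\mathfrak{a}' \cap M = \emptyset$. Then $\mathfrak{a} \in \mathcal{F}$, so $\mathcal{F}$ is not empty. It is easily seen to be inductive and so by Zorn's lemma, it contains a maximal member $\mathfrak{p}$, which is an ideal with the required properties.

Now let $\mathfrak{p}$ be an ideal with the properties stated and assume that $\mathfrak{a}, \mathfrak{b} \not\subseteq \mathfrak{p}$, $\mathfrak{a}\mathfrak{b} \subseteq \mathfrak{p}$. Then $\mathfrak{a} + \mathfrak{p}$, $\mathfrak{b} + \mathfrak{p}$ are strictly larger than $\mathfrak{p}$ and so must meet $M$, say $s = p + a$, $t = q + b \in M$, where $p, q \in \mathfrak{p}$, $a \in \mathfrak{a}$, $b \in \mathfrak{b}$. By hypothesis there exists $x \in R$ such that $sxt \in M$; thus $M$ contains

$$(p + a)x(q + b) = px(q + b) + axq + axb \in \mathfrak{p} + \mathfrak{a}\mathfrak{b} = \mathfrak{p},$$

a contradiction. Hence $\mathfrak{a}, \mathfrak{b} \not\subseteq \mathfrak{p}$ implies $\mathfrak{a}\mathfrak{b} \not\subseteq \mathfrak{p}$ and clearly $1 \notin \mathfrak{p}$, so $\mathfrak{p}$ is indeed prime. ∎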
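The correspondence between prime ideals and m-systems can be watched in a small commutative example. The sketch below is illustrative only (the function names are ours): it tests, in $\mathbf{Z}/12$, whether the complement of the principal ideal generated by $g$ is an m-system. This should succeed exactly for the prime ideals $(2)$ and $(3)$.

```python
def is_m_system(subset, n):
    """Check that 1 lies in the subset and that for all a, b in it some a*x*b lies in it (in Z/n)."""
    m = set(subset)
    if 1 % n not in m:
        return False
    return all(
        any((a * x * b) % n in m for x in range(n))
        for a in m
        for b in m
    )

def complement_of_ideal(generator, n):
    """Complement in Z/n of the principal ideal generated by `generator`."""
    ideal = {(generator * r) % n for r in range(n)}
    return set(range(n)) - ideal

results = {g: is_m_system(complement_of_ideal(g, 12), 12) for g in (2, 3, 4, 6)}
print(results)  # {2: True, 3: True, 4: False, 6: False}
```

For $g = 4$ the complement contains 2, but $2 \cdot x \cdot 2$ always lies in $(4)$, so the complement is not an m-system, in agreement with the fact that $(4)$ is not prime.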


In the commutative case we found that the intersection of all prime ideals of $R$ is the set of all nilpotent elements of $R$. Theorem 8.5.2 can be used to obtain a corresponding result for general rings; however, the situation is a little more complicated here because the ideal generated by a nilpotent element need not be nilpotent. Let us recall that an ideal $\mathfrak{a}$ is called nilpotent if $\mathfrak{a}^r = 0$ for some $r \geq 1$, i.e. $x_1 \cdots x_r = 0$ for any $x_i \in \mathfrak{a}$, and $\mathfrak{a}$ is nil if it consists of nilpotent elements. Clearly every nilpotent ideal is nil, but the converse does not hold generally.

We also recall that a ring $R$ is called semiprime if for any two-sided ideal $\mathfrak{a}$ of $R$,

$$\mathfrak{a}^2 = 0 \Rightarrow \mathfrak{a} = 0. \tag{8.5.1}$$

In terms of elements this states that $aRa = 0$ implies $a = 0$, for all $a \in R$. We observe that a ring is semiprime iff it has no non-zero nilpotent ideals. For when this holds, (8.5.1) is clearly satisfied. Conversely, assume (8.5.1) and let $\mathfrak{a}$ be a nilpotent ideal, say $\mathfrak{a}^{r-1} \neq 0$, $\mathfrak{a}^r = 0$, where $r \geq 2$. Then $2r - 2 \geq r$, hence $0 = \mathfrak{a}^{2r-2} = (\mathfrak{a}^{r-1})^2 \neq 0$, by (8.5.1), which is a contradiction.
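For the rings $\mathbf{Z}/n$ the elementwise condition is easy to test directly, and it should hold exactly when $n$ is square-free (so that $\mathbf{Z}/n$ has no non-zero nilpotent ideals). The following Python sketch, with function names of our own choosing, compares the two conditions for small $n$.

```python
def is_semiprime_zn(n):
    """Z/n is semiprime iff for every a != 0 some element a*r*a is non-zero."""
    return all(
        any((a * r * a) % n for r in range(n))
        for a in range(1, n)
    )

def is_squarefree(n):
    """True if no square d*d with d >= 2 divides n."""
    d = 2
    while d * d <= n:
        if n % (d * d) == 0:
            return False
        d += 1
    return True

# Collect every n in the tested range where the two conditions disagree.
mismatches = [n for n in range(2, 61) if is_semiprime_zn(n) != is_squarefree(n)]
print(mismatches)  # []
```

For instance $\mathbf{Z}/12$ fails the test because $a = 6$ satisfies $6 \cdot r \cdot 6 = 0$ for all $r$, corresponding to the nilpotent ideal $(6)$.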

In the commutative case the semiprime rings are the reduced rings, i.e. rings in which 0 is the only nilpotent element; in the general case every reduced ring is semiprime, but not conversely.

An ideal a in a ring R is called semiprime if R/a is a semiprime ring. We note that R 
itself, as an ideal of R, is semiprime but not prime. 

We first elucidate the relation between prime and semiprime ideals. 


Proposition 8.5.3. Let R be a ring. Then every intersection of prime ideals is semiprime; conversely every semiprime ideal is an intersection of prime ideals.


Proof. Let $\mathfrak{c} = \bigcap \mathfrak{p}_\lambda$, where the $\mathfrak{p}_\lambda$ are prime ideals, and suppose that $aRa \subseteq \mathfrak{c}$. Then $aRa \subseteq \mathfrak{p}_\lambda$, hence $a \in \mathfrak{p}_\lambda$ for all $\lambda$, and so $a \in \bigcap \mathfrak{p}_\lambda = \mathfrak{c}$, and this shows $\mathfrak{c}$ to be semiprime. Conversely, let $\mathfrak{c}$ be a semiprime ideal in $R$; by passing to the residue class ring $R/\mathfrak{c}$, we may take $\mathfrak{c} = 0$. We have to show that in a semiprime ring the intersection of all prime ideals is 0. Take any $a \neq 0$ in $R$; then $aRa \neq 0$, so there exists $b_0$ such that $a_1 = ab_0a \neq 0$. Generally, if we have constructed $a_1, \ldots, a_n$ such that $a_{\nu+1} \in a_\nu R a_\nu$ and $a_n \neq 0$, then there exists $b_n$ such that $a_{n+1} = a_n b_n a_n \neq 0$. The set $M = \{1, a_0 = a, a_1, a_2, \ldots\}$ is an m-system, for given $a_r, a_s$, choose $n \geq r, s$; then $a_r, a_s$ are factors of $a_n$ and $a_n b_n a_n \neq 0$, hence $a_r u a_s = a_{n+1} \neq 0$ for some $u \in R$. Thus $M$ is an m-system and $a \in M$, $0 \notin M$; hence we can find a prime ideal $\mathfrak{p}$ disjoint from $M$, by Theorem 8.5.2. It follows that $a \notin \mathfrak{p}$; since $a$ was any non-zero element of $R$, we see that the intersection of all prime ideals of $R$ is zero. ∎


Corollary 8.5.4. A ring is semiprime if and only if the intersection of all its prime ideals is zero. Hence any semiprime ring R can be written as a subdirect product of prime rings, which are homomorphic images of R. In particular, every semiprimitive ring is semiprime.


Proof. The first part follows by applying Proposition 8.5.3 to the zero ideal, and the second part now follows as in the proof of Theorem 8.3.1. ∎


Just as semiprimitivity leads to the Jacobson radical, so there is a type of radical arising from semiprimeness, but it is not so clear cut and in general there is more than one radical. Let us define a nilradical in a ring $R$ as an ideal $N$ which is nil and such that $R/N$ is semiprime. A ring may have more than one nilradical; in what follows we shall describe the greatest and least such radical. We begin with a lemma.


Lemma 8.5.5. The sum of any family of nil ideals in a ring is a nil ideal. 


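Proof. Consider first two nil ideals $\mathfrak{a}_1, \mathfrak{a}_2$ and write $\mathfrak{a} = \mathfrak{a}_1 + \mathfrak{a}_2$. Then the ideal $\mathfrak{a}/\mathfrak{a}_2$ of the ring $R/\mathfrak{a}_2$ is nil, because $\mathfrak{a}/\mathfrak{a}_2 \cong \mathfrak{a}_1/(\mathfrak{a}_1 \cap \mathfrak{a}_2)$ and the latter is a homomorphic image of $\mathfrak{a}_1$ and hence is nil. Thus any element of $\mathfrak{a}$ has a power in $\mathfrak{a}_2$ and a power of this is 0, therefore $\mathfrak{a}$ is nil. Now an induction shows that the sum of any finite number of nil ideals is nil. In the general case let $\mathfrak{a} = \sum \mathfrak{a}_\lambda$, where each $\mathfrak{a}_\lambda$ is a nil ideal. Any element of $\mathfrak{a}$ lies in the sum of a finite number of the $\mathfrak{a}_\lambda$ and so is nilpotent, therefore $\mathfrak{a}$ is nil. ∎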


By this lemma the sum of all nil ideals in a ring $R$ is a nil ideal; it is denoted by $U = U(R)$ and is called the (Baer) upper nilradical or also the Köthe nilradical. It is indeed a nilradical, for $R/U$ cannot have any non-zero nil ideals, by the maximality of $U$; a fortiori $U$ must be semiprime. Since $U$ contains all nil ideals of $R$, it contains all nilradicals.

To obtain the least nilradical we need another definition. An element $c$ of $R$ is called strongly nilpotent if any sequence $c_1 = c, c_2, c_3, \ldots$ such that $c_{n+1} \in c_n R c_n$ is ultimately zero. It is clear that such an element is nilpotent and that any element of a nilpotent (left or right) ideal is strongly nilpotent. Moreover, in a semiprime ring $R$, the only strongly nilpotent element is 0. For if $c \neq 0$, then $cRc \neq 0$, say $c_2 = cac \neq 0$; now $c_3 = c_2 a' c_2 \neq 0$ for some $a' \in R$ and continuing in this way we obtain a sequence $c_1 = c, c_2, c_3, \ldots$ such that $c_{n+1} \in c_n R c_n$ and no $c_n$ is zero, so $c$ is not strongly nilpotent. Conversely, if 0 is the only strongly nilpotent element in $R$, then $R$ is semiprime. For if not, then $cRc = 0$ for some $c \neq 0$, so $c$ is strongly nilpotent. Thus we have


Proposition 8.5.6. Any ring R is semiprime if and only if the only strongly nilpotent element is 0. ∎
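In a finite ring the strongly nilpotent elements can be found mechanically: $c$ fails to be strongly nilpotent exactly when some chain $c_1 = c$, $c_{n+1} \in c_n R c_n$ stays non-zero for ever, and the set of such $c$ is the largest set of non-zero elements in which every member has a successor. The Python sketch below (the names are ours, and the pruning loop is one possible way to compute this set) does this for the full $2 \times 2$ matrix ring over the field of two elements and for its subring of upper triangular matrices.

```python
from itertools import product

ZERO = (0, 0, 0, 0)

def mul(a, b):
    """2x2 matrix product over the field of two elements; matrices are (a11, a12, a21, a22)."""
    return (
        (a[0] * b[0] + a[1] * b[2]) % 2, (a[0] * b[1] + a[1] * b[3]) % 2,
        (a[2] * b[0] + a[3] * b[2]) % 2, (a[2] * b[1] + a[3] * b[3]) % 2,
    )

def strongly_nilpotent(ring):
    """Elements c for which every chain c_1 = c, c_{n+1} in c_n R c_n reaches 0."""
    successors = {c: {mul(mul(c, r), c) for r in ring} for c in ring}
    alive = {c for c in ring if c != ZERO}  # candidates for an endless non-zero chain
    changed = True
    while changed:
        changed = False
        for c in list(alive):
            if not (successors[c] & alive):  # every continuation of c dies out
                alive.remove(c)
                changed = True
    return set(ring) - alive

full = list(product(range(2), repeat=4))
upper = [m for m in full if m[2] == 0]  # upper triangular matrices

sn_full = strongly_nilpotent(full)
sn_upper = strongly_nilpotent(upper)
print(sn_full)   # {(0, 0, 0, 0)}
print(sn_upper == {ZERO, (0, 1, 0, 0)})  # True
```

The matrix unit $e_{12}$ is nilpotent in both rings, but it is strongly nilpotent only in the triangular ring; in the full matrix ring $e_{12}e_{21}e_{12} = e_{12}$ gives a chain that never reaches 0, as expected for a semiprime ring.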


Now the least nilradical may be described as follows: 


Theorem 8.5.7. In any ring R the set L(R) of all strongly nilpotent elements is the least nilradical, and is equal to the intersection of all prime ideals in R:

$$L(R) = \bigcap \{\mathfrak{p} \mid \mathfrak{p} \text{ prime in } R\}. \tag{8.5.2}$$


Proof. If $x$ is strongly nilpotent in $R$, so is its residue class $\bar{x}$ (mod $\mathfrak{p}$) for any prime ideal $\mathfrak{p}$, so $\bar{x} = 0$, which means that $x \in \mathfrak{p}$. Thus $L(R)$ is contained in the right-hand side of (8.5.2). Now suppose that $c \notin L(R)$; then $c$ is not strongly nilpotent, so there exists a sequence $\{c_n\}$ such that $c_1 = c$, $0 \neq c_{n+1} \in c_n R c_n$. The set $S = \{1, c_1, c_2, \ldots\}$ is an m-system, for given $c_r, c_s$, where $r \leq s$ say, we have $c_{s+1} \in c_s R c_s \subseteq c_r R c_s$. By Theorem 8.5.2 there is an ideal $\mathfrak{p}$ which is maximal disjoint from $S$, and $\mathfrak{p}$ is prime, thus $c \notin \mathfrak{p}$ and this shows that equality holds in (8.5.2), and incidentally, that $L(R)$ is indeed an ideal.

Now $L(R)$ is clearly a nil ideal and $R/L(R)$ is semiprime, hence $L(R)$ is a nilradical, and by (8.5.2) it is the least ideal with semiprime quotient, hence it is the least nilradical. ∎


L(R) is called the prime radical or the (Baer) lower nilradical. 

If a ring $R$ has a non-zero nilpotent right ideal $\mathfrak{a}$, then $R\mathfrak{a}$ is a two-sided ideal, again nilpotent, since $(R\mathfrak{a})^n \subseteq R\mathfrak{a}^n$. The following conjecture, raised in the 1930s, is still unanswered:


Köthe's conjecture. If a ring has a non-zero nil right ideal, then it has a non-zero nil ideal.


Equivalently, the upper nilradical contains every nil right ideal. We remark that in Noetherian rings every nil right ideal is nilpotent (Proposition 7.4.3), so the conjecture is valid in that case.

In general rings the upper and lower nilradical may well be distinct; to give an example it is enough to construct a semiprime ring with a non-zero nil ideal. Let $A$ be the (non-unital) $k$-algebra generated by $x_1, x_2, \ldots$ with the following defining relations: for any element $a$ involving only $x_1, x_2, \ldots, x_n$ we have $a^n = 0$. Then every element of $A$ is nilpotent, and in the augmented algebra $R = A^1$ we have the nil ideal $A$. However, $R$ is semiprime, for, given a non-zero element $a \in R$, let $a$ be of degree $m$ in the $x$'s and put $N = 2m + 2$. Any relation involving $x_N$ consists of terms of degree at least $N$, hence $a x_N a \neq 0$, because this expression has only terms of degree at most $2m + 1$. This shows $R$ to be semiprime.

For another example, this time finitely generated, take the finitely generated non-nilpotent nil algebra $A$ constructed by Golod (see BA, Exercise 5 of Section 6.3). It can be verified that $A^1$ is semiprime but it has the nil ideal $A$. More generally, a simple nil algebra has recently been constructed by Agata Smoktunowicz [2002].

However, in a Noetherian ring the upper and lower nilradicals coincide. For as we saw in Proposition 7.4.3, in a semiprime ring with maximum condition on right annihilators every nil left or right ideal is 0. Hence if $R$ is right Noetherian, then $R/L(R)$ has no non-zero nil ideals and so $U(R) = L(R)$; by symmetry the same holds for left Noetherian rings, so we have


Proposition 8.5.8. In a right (or left) Noetherian ring the upper and lower nilradical coincide. Thus in a Noetherian ring the prime radical is a nilpotent ideal containing all nilpotent ideals. ∎


There is a radical intermediate between $U$ and $L$ which is sometimes considered; it is defined as follows. An algebra $A$ is said to be locally nilpotent if the subalgebra generated by any finite subset is nilpotent. It is clear that for any ideal we have the implications: nilpotent $\Rightarrow$ locally nilpotent $\Rightarrow$ nil. Moreover, it is not hard to show that the sum of any number of locally nilpotent ideals is again locally nilpotent. Now the Levitzki radical $N$ of a ring $R$ is defined as the sum of all locally nilpotent ideals of $R$. By what has been said, $N$ is locally nilpotent, hence nil and it can be shown that $R/N$ has zero Levitzki radical, so it is semiprime, and this shows $N$ to be indeed a nilradical. In any ring we have

$$U \supseteq N \supseteq L,$$

and in the first example given above $N \supsetneq L$, at least in characteristic 0, by the Nagata-Higman theorem (see Further Exercise 15 of Chapter 7), while $U \supsetneq N$ in Golod's example.


Exercises 


1. Show that in any right Noetherian ring 0 can be expressed as a product of prime ideals.

2. Show that a prime ring with a minimal right ideal is primitive.

3. Show that in a semiprime ring the socle and the left socle coincide. (Hint. Use Exercise 2 of Section 8.4.) Give an example to show that this does not hold generally.

4. Let $R$ be the subring of $\mathfrak{M}_2(\mathbf{Z})$ consisting of all matrices $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ such that $a \equiv d$, $b \equiv c \pmod 2$. Show that $R$ is a prime ring, but not an integral domain; find all its idempotents.

5. Give an example of a commutative ring with a nil ideal which is not nilpotent.

6. Show that the sum of any family of locally nilpotent ideals is locally nilpotent. (Hint. Imitate the proof of Lemma 8.5.5.)

7. In any ring with Levitzki radical $N$, verify that $R/N$ has zero Levitzki radical.

8. In any ring $R$, define $N_1$ as the sum of all nilpotent ideals and define $N_\alpha$ for any ordinal $\alpha$ recursively by $N_{\alpha+1}/N_\alpha = N_1(R/N_\alpha)$, while at a limit ordinal, $N_\alpha = \bigcup_{\beta < \alpha} N_\beta$. Show that the union of all the $N_\alpha$ is the lower nilradical of $R$.

9. Show that in a left or right Noetherian ring any nilpotent element is strongly nilpotent. Deduce that every nil (left or right) ideal is nilpotent.

10. Show that a reduced prime ring is an integral domain.

11. Show that any ideal of a semiprime ring is semiprime, qua non-unital algebra.

12. Show that in a reduced ring $R$, if a product of $n$ elements in a certain order is 0, then the product in any order is 0. (Hint. Show that if $a_1 \cdots a_n = 0$, then $a_1x_1a_2x_2 \cdots a_nx_n = 0$ for all $x_i \in R$.)

13. Show that in any ring $R$, every prime ideal contains a minimal prime ideal. (Hint. Take a maximal m-system containing the complement of the given prime ideal.)

14. Let $R$ be a reduced ring and $\mathfrak{p}$ a minimal prime ideal. Show that the monoid $M$ generated by the complement of $\mathfrak{p}$ does not contain 0. Deduce that $M$ does not meet $\mathfrak{p}$ and hence that $R/\mathfrak{p}$ is an integral domain.

15. Show that every reduced ring is a subdirect product of integral domains. (Hint. Use Exercises 10-14. This is a theorem of Andrunakievich-Ryabukhin; the proof is due to Herstein.)




8.6 Prime PI-algebras


In some respects the presence of a polynomial identity has effects similar to the maximum condition and we shall in this section prove Posner's theorem, which is an analogue of Goldie's theorem for PI-rings. The first step is to form fractions with respect to a central Ore set.


Theorem 8.6.1. Let R be a semiprimitive PI-algebra with centre C. Then every non-zero ideal $\mathfrak{a}$ of R meets C non-trivially: $\mathfrak{a} \cap C \neq 0$.


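Proof. By hypothesis $R$ has a family of primitive ideals $\mathfrak{r}_\lambda$ ($\lambda \in \Lambda$) whose intersection is 0. Let $f = 0$ be a polynomial identity for $R$; then $R_\lambda = R/\mathfrak{r}_\lambda$ is a primitive PI-algebra satisfying $f = 0$ and we have an embedding

$$R \to \prod R_\lambda,$$

by Theorem 8.3.1. If $\deg f = d$, then by Theorem 8.3.6, $R_\lambda$ is simple of degree at most $d/2$ over its centre. The canonical homomorphism $\varepsilon_\lambda : R \to R_\lambda$ is surjective and $R_\lambda$ contains $\mathfrak{a}\varepsilon_\lambda$ as an ideal, which is therefore 0 or $R_\lambda$, by simplicity. Put

$$\Lambda_0 = \{\lambda \in \Lambda \mid \mathfrak{a}\varepsilon_\lambda \neq 0\},$$

and choose $\mu \in \Lambda_0$ such that the degree $m$ of $R_\mu$ is maximal. Then the Razmyslov polynomial $D = D_m$ is central and non-zero on $R_\mu$ and since $\mathfrak{a}\varepsilon_\mu = R_\mu$, there exist $a_1, \ldots, a_r \in \mathfrak{a}$ such that $D(a_1\varepsilon_\mu, \ldots, a_r\varepsilon_\mu)$ is a non-zero element of the centre of $R_\mu$. It follows that $y = D(a_1, \ldots, a_r) \neq 0$; moreover, $y\varepsilon_\lambda = 0$ for $\lambda \notin \Lambda_0$, while for $\lambda \in \Lambda_0$, $y\varepsilon_\lambda$ is in the centre of $R_\lambda$; therefore $y \in \mathfrak{a} \cap C$. ∎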


Corollary 8.6.2. Any semiprimitive PI-algebra whose centre is a field is simple.


Proof. For if $\mathfrak{a} \neq 0$, then $\mathfrak{a} \cap C \neq 0$, so $\mathfrak{a}$ contains a unit and so $\mathfrak{a} = R$. ∎


Here none of the conditions can be omitted, for the polynomial ring $k[x]$ is a semiprimitive PI-algebra which is not simple, and for an infinite-dimensional $k$-space $V$, $\mathrm{End}_k(V)$ is a primitive $k$-algebra whose centre is a field, but which is not simple (and of course not a PI-algebra).

Theorem 8.6.1 has a useful generalization. To prove it we first note a PI-analogue of Proposition 7.4.3.


Proposition 8.6.3. Any prime PI-algebra A satisfies the maximum condition on left (or right) annihilators; hence every nil left or right ideal of A is zero.


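Proof. Let $0 \subset \mathfrak{l}_1 \subset \mathfrak{l}_2 \subset \ldots$ be a strictly ascending chain of left annihilators and put $\mathfrak{r}_i = (\mathfrak{l}_i)_r$, the right annihilator of $\mathfrak{l}_i$, so that $\mathfrak{r}_1 \supset \mathfrak{r}_2 \supset \ldots$. By Proposition 7.5.3 we may take the polynomial identity to be multilinear; denote by $d$ the least degree for which there is a homogeneous multilinear polynomial $f$ of degree $d$:

$$f = \sum a_\sigma x_{1\sigma} \cdots x_{d\sigma},$$

such that $f$ vanishes for all $x_i \in \mathfrak{l}_i$ ($i = 1, \ldots, d$). Here $d > 1$, for if $a_1\mathfrak{l}_1 = 0$, then $\mathfrak{l}_1 = 0$, because $A$ is prime. We thus have

$$\sum a_\sigma x_{1\sigma} \cdots x_{d\sigma} = 0 \quad \text{for all } x_i \in \mathfrak{l}_i. \tag{8.6.1}$$

We multiply (8.6.1) on the right by $\mathfrak{r}_{d-1}$. Now when $d\sigma \neq d$, then $x_{d\sigma} \in \mathfrak{l}_{d-1}$, so $x_{d\sigma}\mathfrak{r}_{d-1} = 0$ and the corresponding term in (8.6.1) contributes 0; the remaining terms have the form $a_\sigma x_{1\sigma} \cdots x_{(d-1)\sigma} x_d \mathfrak{r}_{d-1}$. Hence

$$\sum a_\sigma x_{1\sigma} \cdots x_{(d-1)\sigma}\, \mathfrak{l}_d \mathfrak{r}_{d-1} = 0, \tag{8.6.2}$$

where $\sigma$ ranges over all permutations of $1, \ldots, d - 1$. But $\mathfrak{l}_d\mathfrak{r}_{d-1}$ is a non-zero two-sided ideal and $A$ is prime, so

$$\sum a_\sigma x_{1\sigma} \cdots x_{(d-1)\sigma} = 0 \quad \text{for } x_i \in \mathfrak{l}_i,$$

and this contradicts the definition of $d$. Hence the chain of $\mathfrak{l}_i$ breaks off and the maximum condition holds. By symmetry the same is true for right annihilators and now the rest follows by Proposition 7.4.3. ∎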


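Corollary 8.6.4. In a semiprime PI-algebra every nil left or right ideal is zero.


Proof. If $R$ is semiprime, it can be embedded in $\prod R_\lambda$, where $R_\lambda = R/\mathfrak{c}_\lambda$ is a prime PI-algebra, $\bigcap \mathfrak{c}_\lambda = 0$. If $\mathfrak{n}$ is a nil ideal in $R$ and $\varepsilon_\lambda : R \to R_\lambda$ is the canonical projection, then $\mathfrak{n}\varepsilon_\lambda$ is a nil ideal in $R_\lambda$, hence equal to 0 by Proposition 8.6.3, and so $\mathfrak{n} \subseteq \bigcap \ker \varepsilon_\lambda = 0$. ∎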

We deduce 


Theorem 8.6.5 (Rowen). In a semiprime PI-algebra, every non-zero ideal meets the centre nontrivially. In particular, a semiprime PI-algebra whose centre is a field is simple.


Proof. Let $R$ be a semiprime PI-algebra with centre $C$. By Corollary 8.6.4, $R$ has no non-zero nil ideals, hence by Proposition 8.3.5, the polynomial ring $R[t]$ is semiprimitive, and its centre is clearly $C[t]$, so by Theorem 8.6.1, for any non-zero ideal $\mathfrak{a}$ in $R$ we have $\mathfrak{a}[t] \cap C[t] \neq 0$. On comparing coefficients we find that $\mathfrak{a} \cap C \neq 0$. Now the rest follows as in Corollary 8.6.2. ∎


It is now an easy matter to deduce Edward Posner’s theorem, in a rather strong 
form, following Louis Rowen, 1973 (see Rowen (1980)): 


Theorem 8.6.6 (Posner, 1960). Let R be a prime PI-algebra with centre C. Then C is an integral domain, and if K is its field of fractions, then the natural mapping $R \to Q = R \otimes_C K$ is an embedding and Q is a finite-dimensional simple K-algebra.


Proof. The first part follows by Corollary 7.1.11; now any multilinear identity in $R$ also holds in $Q$, so by Theorem 8.6.5, $Q$ is simple. Hence by Kaplansky's theorem (Theorem 8.3.6), $Q$ is finite-dimensional over $K$. ∎




Let $R$ be a prime PI-algebra and $Q$ its quotient ring by fractions of the centre. By Theorem 8.6.6, $Q$ is of finite dimension over its centre and this dimension is a square, $n^2$ say (Proposition 5.2.2); the number $n$ is called the PI-degree of $R$. We can now determine the PI-degree of the generic division algebra:


Corollary 8.6.7. The generic division algebra of degree n has PI-degree n.


Proof. The generic matrix ring $F_{(n)}$ is a prime PI-algebra, hence Theorem 8.6.6 applies, and its skew field of fractions $D$ is a finite-dimensional central simple algebra. Moreover, $F_{(n)}$ satisfies the standard identity $S_{2n}$, but no identity of lower degree, hence the same is true of $D$. It follows that on passing to a splitting field we obtain an $n \times n$ matrix ring, hence $D$ is of degree $n$ over its centre. ∎
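The role of the standard identity can be checked numerically in the smallest case $n = 2$. The Python sketch below is an illustration with helper names of our own; it evaluates the standard polynomial $S_d(x_1, \ldots, x_d) = \sum_\sigma \operatorname{sgn}(\sigma)\, x_{\sigma 1} \cdots x_{\sigma d}$ on integer matrices. It should find that $S_4$ vanishes on randomly chosen $2 \times 2$ matrices, while $S_3$ does not vanish on the matrix units $e_{11}, e_{12}, e_{21}$.

```python
from itertools import permutations
import random

def matmul(a, b):
    """Product of two square integer matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def sign(perm):
    """Sign of a permutation of 0..d-1, computed from its number of inversions."""
    inversions = sum(
        1 for i in range(len(perm)) for j in range(i + 1, len(perm)) if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def standard_polynomial(mats):
    """Evaluate S_d on the list of d square matrices `mats`."""
    n = len(mats[0])
    total = [[0] * n for _ in range(n)]
    for perm in permutations(range(len(mats))):
        prod = mats[perm[0]]
        for idx in perm[1:]:
            prod = matmul(prod, mats[idx])
        s = sign(perm)
        for i in range(n):
            for j in range(n):
                total[i][j] += s * prod[i][j]
    return total

rng = random.Random(0)  # fixed seed, so the run is reproducible
random_2x2 = [[[rng.randint(-5, 5) for _ in range(2)] for _ in range(2)] for _ in range(4)]
s4 = standard_polynomial(random_2x2)

e11, e12, e21 = [[1, 0], [0, 0]], [[0, 1], [0, 0]], [[0, 0], [1, 0]]
s3 = standard_polynomial([e11, e12, e21])

print(s4)  # [[0, 0], [0, 0]]
print(s3)  # [[2, 0], [0, 1]]
```

Exact integer arithmetic is used, so the vanishing of $S_4$ here is not affected by rounding; of course a finite number of random trials only illustrates the identity and does not prove it.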


The PI-degree has been used to give a simple characterization of separable algebras, by Michael Artin and Claudio Procesi.
For any ring $R$ with centre $C$, the mapping

$$\lambda_{a,b} : x \mapsto axb \quad (a, b \in R)$$

is an endomorphism of $R$ as $C$-module and it is easily checked that the correspondence $a \otimes b \mapsto \lambda_{a,b}$ defines a homomorphism

$$\lambda : R^{\mathrm{o}} \otimes_C R \to \mathrm{End}_C(R). \tag{8.6.3}$$

If $R$ is a finitely generated projective $C$-module and (8.6.3) is an isomorphism, $R$ is called an Azumaya algebra or also central separable (it can be shown that this is equivalent to $R$ being separable as $C$-algebra, see Section 4.7). The proof of the Artin-Procesi theorem given below uses the Razmyslov polynomial $D$. We recall that $D = D_n$ is multilinear alternating in $n^2$ variables $x_1, \ldots, x_{n^2}$ (and others which we shall not name explicitly). If we put $D_{n,\nu}(x) = D_n(x_1, \ldots, x_{\nu-1}, x, x_{\nu+1}, \ldots, x_{n^2})$, then the expression $x_0 D_n - \sum_\nu x_\nu D_{n,\nu}(x_0)$ is an alternating multilinear function of the $n^2 + 1$ variables $x_0, \ldots, x_{n^2}$ and so vanishes on any prime ring of PI-degree $n$, because such a ring can be embedded in its quotient ring which is spanned by $n^2$ elements over its centre.


Theorem 8.6.8 (Artin-Procesi). Let R be a prime PI-algebra of PI-degree n. Then R is Azumaya provided that each simple homomorphic image of R has PI-degree n.


Proof. (W. Schelter) Consider the Razmyslov polynomial $D_n$; if its arguments lie in $R$, its values will be in the centre $C$ of $R$. If they all lie in some maximal ideal of $R$, then $D_n$ will vanish on some homomorphic image of $R$, which therefore has PI-degree less than $n$, against the hypothesis. Hence the left ideal generated by the values must be the whole of $R$:

$$1 = \sum_i b_i D_n(a_{\nu i}), \quad \text{where } a_{\nu i}, b_i \in R, \tag{8.6.4}$$

and the other arguments of $D_n$ also lie in $R$. Denote by $D_{n,\nu i}$ the function obtained from $D_{n,\nu}$ (defined above) by replacing $x_\mu$ by $a_{\mu i}$ for $\mu \neq \nu$. By the remark preceding the theorem, we have $cD_n(a_{\nu i}) - \sum_\nu a_{\nu i} D_{n,\nu i}(c) = 0$ for any $c \in R$; hence by (8.6.4), since the values of $D_n$ are central,

$$c = \sum_i b_i\, c\, D_n(a_{\nu i}) = \sum_{\nu, i} b_i a_{\nu i} D_{n,\nu i}(c).$$

Thus we have a (finite) projective coordinate system for $R$, showing that $R$ is finitely generated projective over $C$.

It remains to show that (8.6.3) is an isomorphism. We note that $\mathrm{End}_C(R)$ is generated by elements of the form $cD_{n,\nu i}$ and if $D_{n,\nu i}(x) = \sum_k u_{\nu i k}\, x\, v_{\nu i k}$, then $(\sum_k c u_{\nu i k} \otimes v_{\nu i k})\lambda = cD_{n,\nu i}$, so $\lambda$ is surjective. Now

$$\sum_j p_j \otimes q_j = \sum_{j,\nu,i} p_j \otimes b_i a_{\nu i} D_{n,\nu i}(q_j) = \sum_{j,\nu,i} p_j D_{n,\nu i}(q_j) \otimes b_i a_{\nu i} = \sum_{\nu,i,k} \Big(\sum_j p_j u_{\nu i k} q_j\Big) v_{\nu i k} \otimes b_i a_{\nu i}. \tag{8.6.5}$$

If $\sum p_j \otimes q_j \in \ker \lambda$, then $\sum p_j x q_j = 0$ for all $x \in R$, hence $\sum p_j \otimes q_j = 0$ by (8.6.5); this shows $\lambda$ to be injective, so it is indeed an isomorphism. ∎


It can be shown that the sufficient condition of this theorem is also necessary (see 
Rowen (1980), (1988)). 


Exercises 


1. Let $D$ be a skew field with centre $k$. Show that for any (skew) subfield $E$ of $D$, $E$ and $Ek$ have the same PI-degree.

2. Show that in any PI-algebra the upper and lower nilradicals coincide.

3. (Amitsur) Let $R$ be a PI-algebra and $N$ the sum of all its nil ideals. Use Corollary 8.3.8 to show that $R/N$ can be embedded in a matrix ring $\mathfrak{M}_n(E)$, where $E$ is a commutative ring. Deduce that $R$ satisfies an identity $S_{2n}^m = 0$, where $S_{2n}$ is the standard polynomial and $m \geq 1$. (By Exercise 1 of Section 7.7, $m$ cannot always be taken to be 1. Hint. Express $R$ as homomorphic image of a relatively free algebra satisfying a given polynomial identity holding in $R$.)


8.7 Firs and semifirs


Principal ideal domains (PIDs) include several important types of commutative ring 
such as the ring of integers, polynomial rings in one variable over a field and certain 
rings of algebraic integers, more specifically, any Dedekind domain with unique 
factorization (see BA, Section 10.5). In the non-commutative case there are no 
such striking examples of PIDs, but there is a wider class of rings which reduce to 
PIDs when commutativity is imposed. 




Definition. By a free right ideal ring or right fir for short, we understand a ring R 
in which each right ideal is free, of uniquely determined rank. Left firs are defined 
similarly and a left and right fir is called a fir. 

It follows that each right (or left) fir $R$ has invariant basis number (IBN), for this is so when $R$ is right Noetherian, and when this is not the case, it will contain free right ideals of arbitrary rank, by Proposition 7.1.9. We shall meet firs again in Section 11.5, where it will be shown that the free algebra $k\langle X \rangle$ on any set $X$ over a field $k$ is a fir; this may be regarded as a generalization of the fact that the polynomial ring $k[x]$ in a single variable over a field is a PID.

Often one meets an even wider class than firs; to describe it we shall need to look at general relations in rings. A relation

$$x_1y_1 + \cdots + x_ny_n = 0 \tag{8.7.1}$$

or in terms of vectors, $x.y = 0$, in a ring $R$ is said to be trivial if for each $i = 1, \ldots, n$, either $x_i = 0$ or $y_i = 0$. Every non-zero ring has non-trivial relations, for example, if $x = (1, 1)$, $y = (-1, 1)^T$, then we have the non-trivial relation

$$x.y = (1 \quad 1)\begin{pmatrix} -1 \\ 1 \end{pmatrix} = 0. \tag{8.7.2}$$


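However, this relation can be transformed into a trivial relation by replacing $x$, $y$ by $x'$, $y'$, given by

$$x' = (1 \quad 1)\begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix} = (1 \quad 0), \qquad y' = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} -1 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{8.7.3}$$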


Let us write the relation (8.7.1) as $x.y = 0$, where $x = (x_1, \ldots, x_n)$, $y = (y_1, \ldots, y_n)^T$. Then (8.7.1) is said to be trivializable if there is an invertible matrix $P$ over $R$ such that the relation $xP^{-1}.Py = 0$ is trivial; we also say that (8.7.1) is trivialized by $P$. More generally, a matrix relation $XY = 0$, where $X$ is $r \times n$ and $Y$ is $n \times s$, is trivial if for each $i = 1, \ldots, n$, either the $i$-th column of $X$ or the $i$-th row of $Y$ is 0, and $XY = 0$ is trivializable if $XP^{-1}.PY = 0$ is trivial for some $P \in GL_n(R)$. In the above example, (8.7.3) shows the relation (8.7.2) to be trivializable.
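The definition can be checked mechanically on the relation (8.7.2). In the Python sketch below the matrix $P$ is one admissible choice, assumed here for illustration (other invertible matrices would serve equally well), and the helper names are ours.

```python
def row_times_matrix(x, m):
    """Row vector x times matrix m (list of rows)."""
    return [sum(x[i] * m[i][j] for i in range(len(x))) for j in range(len(m[0]))]

def matrix_times_column(m, y):
    """Matrix m (list of rows) times column vector y."""
    return [sum(row[j] * y[j] for j in range(len(y))) for row in m]

def is_trivial(x, y):
    """A relation x.y = 0 is trivial if for each i either x_i = 0 or y_i = 0."""
    return all(a == 0 or b == 0 for a, b in zip(x, y))

x = [1, 1]
y = [-1, 1]
P = [[1, 1], [0, 1]]        # assumed choice of the trivializing matrix
P_inv = [[1, -1], [0, 1]]   # its inverse over the integers

x_new = row_times_matrix(x, P_inv)
y_new = matrix_times_column(P, y)

print(is_trivial(x, y))          # False
print(x_new, y_new)              # [1, 0] [0, 1]
print(is_trivial(x_new, y_new))  # True
```

Since $xP^{-1} \cdot Py = x.y$, the transformed pair still satisfies the relation; what has changed is that each summand now vanishes separately.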
We begin by describing the rings in which every relation is trivializable: 


Theorem 8.7.1. Let R be a non-zero ring. Then the following conditions are equivalent: 


(a) Every relation in R can be trivialized. 

(b) Every finitely generated right ideal in R is free, as right R-module, of unique rank. 
(c) R has IBN and every finitely generated submodule of a free right R-module is free. 
(d) Every matrix relation in R can be trivialized.

(a°)-(d°) the left-right analogues of (a)-(d).

Further, any such ring is an integral domain.


Proof. (a) $\Rightarrow$ (b). Let $\mathfrak{a}$ be a finitely generated right ideal of $R$ and let $n$ be the least integer such that $\mathfrak{a}$ has an $n$-element generating set, $u_1, \ldots, u_n$ say. Then $\mathfrak{a}$ is free on $u_1, \ldots, u_n$, for if not, take a non-trivial relation $u.a = 0$. By (a) this can be trivialized, say $u' = uP^{-1}$, $a' = Pa$. Since $a \neq 0$, we have $a' \neq 0$, say $a'_n \neq 0$. But $u'.a' = 0$ is trivial, so $u'_n = 0$ and it follows that $\mathfrak{a}$ is generated by $u'_1, \ldots, u'_{n-1}$, which contradicts the choice of $n$; hence $\mathfrak{a}$ is free on $u_1, \ldots, u_n$. If $\mathfrak{a}$ has another basis $v_1, \ldots, v_m$ and $m \neq n$, then $m > n$ and $R^m \cong R^n$; this yields an endomorphism $f$ of $\mathfrak{a}$ which is surjective but not injective. Thus $fu_1, \ldots, fu_n$ generate $\mathfrak{a}$ but not freely; by the first part we see that $\mathfrak{a}$ can be generated by fewer than $n$ elements, which is again a contradiction, hence $m = n$ and $\mathfrak{a}$ has unique rank.

(b) $\Rightarrow$ (c). Let $F$ be a free right $R$-module and $G$ a finitely generated submodule. The finite generating set of $G$ involves only finitely many generators of $F$; by ignoring the rest we may take $F$ to be finitely generated. Let $\lambda_1$ be the projection of $F$ on the first factor $R$ and denote by $F'$ the kernel of $\lambda_1$. Then we have the exact sequence

$$0 \to F' \cap G \to G \to \mathfrak{a} \to 0, \tag{8.7.4}$$

where $\mathfrak{a}$ is the image of $G$ under $\lambda_1$, a finitely generated right ideal of $R$. By (b), $\mathfrak{a}$ is free, hence (8.7.4) splits and $G \cong (F' \cap G) \oplus \mathfrak{a}$. By induction on the rank of $F$, $F' \cap G$ as finitely generated submodule of $F'$ is free and it follows that $G$ is free.

We next show that $R$ is an integral domain. Let $a \in R$ and denote its right annihilator by $\mathfrak{n}$. Then $aR \cong R/\mathfrak{n}$; since $aR$ is free, we have $R = \mathfrak{n} \oplus \mathfrak{a}$, where $\mathfrak{a} \cong aR$. Both $\mathfrak{n}$ and $\mathfrak{a}$ are free; by the uniqueness of the rank, either $\mathfrak{a} = 0$ or $\mathfrak{n} = 0$, so $a$ is either 0 or right regular. This holds for all $a \in R$, hence $R$ is an integral domain. Now if $R$ is right Ore, then it is embeddable in a skew field and so has IBN (BA, Theorem 4.6.7); otherwise by Proposition 7.1.9 it has free right ideals of any finite rank and this rank is unique by (b); hence $R$ has IBN.

(c) $\Rightarrow$ (d). Let $XY = 0$ be a matrix relation, where $X \in {}^rR^n$, $Y \in {}^nR^s$. The matrix $X$ defines a linear map $\varphi : {}^nR \to {}^rR$ by left multiplication and we have an exact sequence

$$0 \to \ker \varphi \to {}^nR \to \operatorname{im} \varphi \to 0. \tag{8.7.5}$$

As a finitely generated submodule of ${}^rR$, $\operatorname{im} \varphi$ is free, so the exact sequence (8.7.5) splits, and by changing the basis in ${}^nR$ we obtain a basis adapted to the submodule $\ker \varphi$: ${}^nR = \ker \varphi \oplus F$, where $F \cong \operatorname{im} \varphi$, $\dim \ker \varphi = t$, say. If this change of basis is described by $P \in GL_n(R)$, we put $X' = XP^{-1}$, $Y' = PY$. Since $X'Y' = XY = 0$, the columns of $Y'$ lie in $\ker \varphi$; thus the rows of $Y'$ after the first $t$ are 0, while the first $t$ columns of $X'$ are 0. Hence the matrix relation $X'Y' = 0$ is indeed trivial.
Now (a) is a special case of (d) and (a°)-(d°) follow by the evident symmetry of (a). ∎


A non-zero ring satisfying the conditions of this theorem is called a semifir. By (b) every left or right fir is a semifir; however there are right firs that are not left firs (see Exercise 5). We also remark that a commutative semifir is just a Bezout domain, while a commutative fir is a PID.
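In the commutative case the trivialization of a relation can be carried out explicitly. The following Python sketch works over the integers, a PID and hence a semifir; it is only an illustration, and the function `trivialize` and its bookkeeping are our own. It applies the Euclidean algorithm to the row $x$ by elementary column operations, applies the inverse row operations to $y$, and records the product of the row operations as the matrix $P$, so that $x' = xP^{-1}$ and $y' = Py$.

```python
def trivialize(x, y):
    """Given integer vectors with x.y = 0, return (x', y', P) with x'P = x, y' = Py
    and the relation x'.y' = 0 trivial; P is a product of elementary integer matrices."""
    x, y = list(x), list(y)
    n = len(x)
    if sum(a * b for a, b in zip(x, y)) != 0:
        raise ValueError("x.y must be 0")
    P = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for j in range(1, n):
        # Euclidean algorithm on the pair (x[0], x[j]) until x[j] becomes 0.
        while x[j] != 0:
            q = x[0] // x[j]
            x[0] -= q * x[j]                 # column operation on x
            y[j] += q * y[0]                 # inverse row operation on y
            P[j] = [pj + q * p0 for pj, p0 in zip(P[j], P[0])]
            x[0], x[j] = x[j], x[0]          # swap positions 0 and j everywhere
            y[0], y[j] = y[j], y[0]
            P[0], P[j] = P[j], P[0]
    return x, y, P

x, y = [4, 6, 10], [1, 1, -1]                # 4 + 6 - 10 = 0
x_new, y_new, P = trivialize(x, y)
print(x_new, y_new)  # [2, 0, 0] [0, 2, -1]
```

At the end $x'$ has the form $(g, 0, \ldots, 0)$, where $g$ is up to sign the highest common factor of the $x_i$, and the relation forces $g y'_1 = 0$, so either $x' = 0$ or $y'_1 = 0$; in both cases the relation is trivial. In a general Bezout domain the Euclidean step would have to be replaced by a Bezout relation.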

As already remarked, the free algebra $k\langle X \rangle$ on any set $X$ over a field $k$ is a fir; more generally, this holds for the tensor $D$-ring $D_k\langle X \rangle$, where $D$ is any skew field and $k$ a subfield. This is proved by means of the weak algorithm, which we shall not define here (see Cohn (1985), Chapter 2); it is easier to prove the weaker statement that $D_k\langle X \rangle$ is a semifir and we shall do so now (but see also Section 11.5).


Theorem 8.7.2. Let D be a skew field and F a subfield. Then the tensor D-ring $D_F\langle X \rangle$ on any set X is a semifir.


Proof. Let $\{u_\lambda\}$ be a right $F$-basis of $D$ and $X = \{x_i\}$; then every element of $D_F\langle X \rangle$ can be uniquely written in the form $c + \sum u_\lambda x_i f_{\lambda i}$, where $c \in D$ and $f_{\lambda i} \in D_F\langle X \rangle$. This follows because we can express every element of $D$ as $a = \sum u_\lambda \alpha_\lambda$, where $\alpha_\lambda \in F$, and $\alpha x_i = x_i \alpha$ for all $\alpha \in F$.
Suppose now that we have a relation in $D_F\langle X \rangle$:

$$\sum_{i=1}^{n} a_i b_i = 0. \tag{8.7.6}$$

In order to show that this relation can be trivialized it is enough to do this in a given degree; thus we can assume that the $a_i, b_i$ are homogeneous and that $\deg a_i + \deg b_i = r \geq 0$. We shall use double induction, on $n$ and $r$. If each $a_i$ has positive degree, we can write $a_i = \sum u_\lambda x_j a_{\lambda j i}$; equating cofactors of $u_\lambda x_j$ we find

$$\sum_i a_{\lambda j i} b_i = 0,$$

and now the result follows by induction on $r$, for we can make a transformation reducing one of the $b$'s to 0. There remains the case where some $a_i$, say $a_1$, has degree 0. Then we can replace $a_2$ by $a_2 - a_1 . a_1^{-1}a_2 = 0$ and $b_1$ by $b'_1 = b_1 + a_1^{-1}a_2b_2$; we thus obtain

$$\sum a_i b_i = a_1 b'_1 + a_3 b_3 + \cdots + a_n b_n = 0,$$

and we have now diminished $n$, and so can apply induction on $n$ to complete the proof. ∎


We conclude this section with a result that is often useful, the inertia lemma for semifirs (see also Cohn (1985), Lemma 4.6.3). We recall that from any ring $R$ we can form the polynomial ring $R[t]$ in a central indeterminate $t$; its completion by power series is the formal power series ring $R[[t]]$, and if we localize now at the powers of $t$ we obtain the formal Laurent series ring $R((t))$. Since $t$ is regular in $R[[t]]$, the latter ring is embedded in the Laurent series ring.

Let $B$ be any ring and $A$ a subring. Then $A$ is said to be finitely inert in $B$ if for any matrix $Z \in {}^rA^s$, if $Z = XY$ over $B$, where $X$ is $r \times n$ and $Y$ is $n \times s$, there exists $P \in GL_n(B)$ such that $XP^{-1}$, $PY$ have their entries in $A$.


Lemma 8.7.3 (Inertia lemma). Let R be a semifir. Then for any central indeterminate t, the formal power series ring $R[[t]]$ is finitely inert in $R((t))$.


Proof. Put $S = R[[t]]$, $T = R((t))$ and indicate the natural homomorphism $S \to R$ by $f \mapsto (f)_0$; it amounts to putting $t = 0$. We take $A \in {}^rS^s$ and suppose that over $T$:

$$A = PQ, \quad \text{where } P \text{ is } r \times n \text{ and } Q \text{ is } n \times s. \tag{8.7.7}$$

If $P$ or $Q$ is 0, there is nothing to prove, so we may suppose $P, Q \neq 0$. Over $T$ every non-zero matrix $C$ can be written in the form $t^\nu C'$, where $\nu \in \mathbf{Z}$, $C'$ has entries in $S$ and $(C')_0 \neq 0$. Let $P = t^\mu P'$, $Q = t^\nu Q'$, where $(P')_0, (Q')_0 \neq 0$. Dropping the dashes and writing $\mu + \nu = -\lambda$, we can rewrite (8.7.7) as

$$A = t^{-\lambda} PQ, \quad \text{where } P \in {}^rS^n,\ Q \in {}^nS^s \text{ and } (P)_0, (Q)_0 \neq 0. \tag{8.7.8}$$

If $\lambda \leq 0$, there is nothing to prove, so assume that $\lambda > 0$. Then $(PQ)_0 = (P)_0(Q)_0 = 0$; since $R$ is a semifir, we can find a matrix $U \in GL_n(R)$ trivializing this relation, and on replacing $P$, $Q$ by $PU^{-1}$, $UQ$ we find that for some $h$ ($0 < h < n$) all the columns in $(P)_0$ after the first $h$ are 0, while the first $h$ rows of $(Q)_0$ are 0. If we multiply $P$ on the right by $V = tI_h \oplus I_{n-h}$, and $Q$ on the left by $V^{-1}$, then $P$ becomes divisible by $t$ while $Q$ still has all its entries in $S$. In this way we can, by cancelling a factor $t$, replace $t^{-\lambda}$ by $t^{1-\lambda}$ in (8.7.8) and after $\lambda$ steps obtain the same equation with $\lambda = 0$. This proves the finite inertia. ∎


We remark that there is a stronger notion, total inertia, and the inertia theorem 
asserts that an inversely filtered fir with inverse weak algorithm is totally inert in 
its completion (see Cohn (1985), Theorem 2.9.15). 


Exercises 


1. Show that a right Ore domain is a right fir iff it is right principal.

2. Show that every semifir is weakly finite.

3. Let $R$ be a semifir and $A$, $B$ finitely generated submodules of a free $R$-module. Show that $A \cap B$, $A + B$ are again free and that $\operatorname{rk}(A + B) + \operatorname{rk}(A \cap B) = \operatorname{rk} A + \operatorname{rk} B$.

4. Let R be a right fir. Show that any submodule of a free right R-module is free. 

5. In the group algebra over a field $k$ of the free group on $x$, $y$ let $R$ be the subalgebra generated by $x, y, x^{-1}y, x^{-2}y, \ldots$. Verify that $R$ is a semifir but not a left fir (it can be shown that $R$ is a right fir, see Cohn (1985), Section 2.10).


WwW tO 


Further exercises on Chapter 8 


1. Show that if $V$ is a vector space of infinite dimension $\nu$ over a skew field $K$ of
cardinal $\alpha$, then $V$ has the cardinal $\nu\alpha$ and $V^* = \mathrm{Hom}_K(V, K)$ has cardinal
$\alpha^{\nu}$. Show that if $K$ is commutative and $\dim V^* = \nu^*$, then $\nu^* \ge \alpha$, and
deduce that $\nu^* = \alpha^{\nu}$ (this is true even when $K$ is skew, see Jacobson (1953)).

2. Let $K$ be a skew field and for $n \ge 1$ embed $\mathfrak{M}_{2^n}(K)$ in $\mathfrak{M}_{2^{n+1}}(K)$ by mapping $A$
to $A \oplus A$. Show that the direct limit of the rings $\mathfrak{M}_{2^n}(K)$ is simple. What are its
projective modules?

3. Let $K$ be a skew field. Describe the ideal structure of the following infinite matrix
rings over $K$: (i) the ring of all row-finite and column-finite matrices, (ii) the
ring of all matrices which are equal to a scalar outside a finite square.


4. Let $V$ be a $(K, R)$-bimodule, where $K$ is a skew field and $R$ is a ring which acts
fully on $V$ with centralizer $K$. Show that $V_R$ is simple; if moreover $R$ acts fully on
${}^2V$, then $R$ acts densely on $V$.


5. (Jacobson) Let $k$ be a field and $R$ the (unital) $k$-algebra generated by $u, v$ subject
to $uv = 1$. Show that $R$ is primitive by representing it by the linear transformations
$e_iu = e_{i+1}$, $e_iv = e_{i-1}$ if $i > 1$, $e_1v = 0$.

6. (P. Samuel) Let $k$ be a field and $k\langle x, y\rangle$ the free $k$-algebra on $x, y$. Show that this is
primitive by representing it as follows: $e_ix = e_{i^2+1}$, $e_iy = e_{i-1}$ if $i > 1$, $e_1y = 0$.


7. Show that in any ring $R$ the ideals with primitive quotient are just the cores of
maximal right ideals. Show further that the intersection of these ideals is $J(R)$.

8. Show that in a prime ring $R$ every non-zero right ideal is a faithful $R$-module.

9. Show that every maximal ideal in a ring is a prime ideal.

10. Show that the least ideal $\mathfrak{N}$ in a ring $R$ such that $R/\mathfrak{N}$ is semiprime is a nil ideal
and deduce that $\mathfrak{N}$ is the lower nilradical.

11. Find the socle of the ring of all upper triangular matrices over a field, and show
that it may differ from the left socle.

12. Let $K$ be a commutative ring and $t$ an indeterminate. Show that if $a_0 + a_1t +
\dots + a_nt^n \in K[t]$ is a unit, then $a_0$ is a unit and $a_1, \dots, a_n$ are nilpotent. Deduce
that the Jacobson radical of $K[t]$ is its nilradical.

13. Show that the left (or right) ideal generated by a strongly nilpotent element is
nilpotent.

14. Let $R$ be a prime ring. Show that any non-zero central element of $R$ is a
non-zero-divisor; deduce that the centre of $R$ is an integral domain.

15. (Kaplansky) With any square matrix $A$ over a ring $K$ we can associate an 'infinite
periodic' matrix by taking the diagonal sum of countably many copies of $A$. Let
$R$ be the ring of all upper triangular infinite periodic matrices over a field $k$.
Show that $R$ is prime, with Levitzki nilradical equal to the Jacobson radical.

16. Let $R$ be any $K$-algebra and define the centroid of $R$ as $\mathrm{End}({}_RR_R)$. Show that the
centroid $\Gamma$ is a commutative $K$-algebra and that $R$ has a natural $\Gamma$-algebra
structure. If $R$ is primitive and $R^2 = R$, show that $\Gamma$ is an integral domain.

17. Show that the group algebra of the additive group of rationals is a Bezout
domain. For real numbers $\alpha, \beta$ show that the monoid $M$ of all positive numbers
of the form $m\alpha + n\beta$ is such that the monoid algebra $kM$ is Bezout iff $\alpha/\beta$ is
rational.

18. Over a semifir $R$ show that if a matrix product $AB$ ($A \in {}^rR^n$, $B \in {}^nR^s$) has a $u \times v$
block of zeros, then for some $t$ ($0 \le t \le n$) and a suitable $P \in \mathrm{GL}_n(R)$, $AP^{-1}$ has
a $u \times t$ block of zeros and $PB$ has an $(n - t) \times v$ block of zeros (this is the partition
lemma, see Cohn (1985), Lemma 1.1.4).

19. Show that the group algebra of a free group is a semifir.


Skew fields 


Skew fields arise quite naturally in the application of Schur's lemma and elsewhere,
but most of the known theory deals with the case of skew fields finite-dimensional
over their centres (see Chapter 5). In the general case only isolated results are
known, many of them depending on the coproduct construction (see Cohn (1995),
Schofield (1985)). This lies outside our framework and we shall confine ourselves
to presenting some of the highlights which do not require special prerequisites.
After some general remarks in Section 9.1, including the Cartan-Brauer-Hua
theorem, we shall give an account in Section 9.2 of determinants over skew fields.
This is followed in Section 9.3 by a proof of the existence of free fields, based on
the specialization lemma whose proof has been much simplified. Many of the
concepts (though fewer of the actual results) of valuation theory carry over to skew fields
and in Section 9.4 we shall examine the situation and pursue some of the results that
continue to hold in the general case. The final section, Section 9.5, is concerned with
the question when the left and right dimensions of a field extension are equal. We
shall also meet examples of skew field extensions whose left and right dimensions
are different (Emil Artin's problem); they are most easily obtained as
pseudo-linear extensions, a more tractable class of finite-dimensional extensions.
Throughout this chapter we shall use the term 'field' to mean 'not necessarily
commutative division ring'; the prefix 'skew' is sometimes added for emphasis.


9.1 Generalities 


Let K be a skew field. Its centre C is a commutative field, and the characteristic of C is 
also called the characteristic of K. We have already seen in Theorem 5.1.14 that every 
finite field is commutative. However, an infinite field may well have a finite centre, in 
fact, for every commutative field $k$ there is a skew field with centre $k$. In characteristic
0 we can adjoin variables $u, v$ with the relation $uv - vu = 1$ to obtain such a field, but
there is another construction which applies quite generally; a third construction, 
generally valid, is used in the proof of Theorem 9.3.3 below. 


Proposition 9.1.1. Let $k$ be any commutative field. Then there is a skew field $D$ whose
centre is $k$.


Proof. Consider the group algebra of the additive group of rationals over
$k$: $k[x^{\lambda} \mid \lambda \in \mathbb{Q}]$. Clearly this is a commutative $k$-algebra; we take the subalgebra
generated by $x^{2^{-n}}$, $n = 0, 1, \dots$ and form its field of fractions

$$E = k(x, x^{1/2}, x^{1/4}, \dots).$$

This field has the automorphism $\alpha : f(x) \mapsto f(x^2)$, which is of infinite order.
Moreover, $k$ is the precise subfield fixed by $\alpha$, for if $f$ involves $x$, let $2^n$ ($n \ge 0$) be the largest
denominator in any power of $x$ occurring in $f$. Then $x^{r/2^n}$ occurs in $f$ for some odd $r$,
but it does not occur in $f^{\alpha}$, so $f$ is not fixed under $\alpha$. Now form the skew polynomial
ring $E[y; \alpha]$ and let $D$ be its field of fractions. By Theorem 7.3.6, $D$ has the precise
centre $k$. ∎


Much of linear algebra can be done over a skew field (see BA, Chapter 4); this 
rests on the fact that every module over a field is free. It is well known (BA, 
Theorem 4.6.8) that this property actually characterizes skew fields. 

We go on to prove two general results, which although not used here, are of 
importance, with many applications. The first concerns normal subgroups of the 
multiplicative group of a field; the proof is based on an idea of Jan Treur. 


Theorem 9.1.2 (Cartan-Brauer-Hua theorem). Let $K \subseteq L$ be skew fields, and
assume that $K^{\times}$ is a normal subgroup of $L^{\times}$. Then either $K = L$ or $K$ is contained in
the centre of $L$.


Proof. For any $c \in L^{\times}$ the mapping $\alpha_c : x \mapsto c^{-1}xc$ satisfies

$$(x + y)\alpha_c = x\alpha_c + y\alpha_c.$$

Further, we have $x\alpha_c = x \cdot (x, c)$, where $(x, c) = x^{-1}c^{-1}xc \in K$, whenever $c \in K^{\times}$. If
$K \neq L$, take $a \in L \setminus K$; for any $c \in K^{\times}$, since $(1, c) = 1$, we have

$$(a + 1)(a + 1, c) = a(a, c) + 1.$$

By the linear independence of $1, a$ over $K$, we have $(a, c) = (a + 1, c) = 1$; thus
$ac = ca$ for all $c \in K$, $a \notin K$. But if $b \in K$, then $a + b \notin K$, therefore $bc - cb =
(a + b)c - c(a + b) - [ac - ca] = 0$, hence $c$ commutes with every element of $L$,
i.e. $K$ is contained in the centre of $L$. ∎


This result was obtained in the case of division algebras by Henri Cartan in 1947 as 
a consequence of his Galois theory of skew fields. Richard Brauer and independently 
Hua Loo-Keng observed in 1949 that the general case could be proved directly. 
Our second result concerns additive mappings preserving inversion: 


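Theorem 9.1.3 (Hua's theorem). Let $\sigma : K \to L$ be a mapping between two skew
fields such that

$$(a + b)^{\sigma} = a^{\sigma} + b^{\sigma}, \quad 1^{\sigma} = 1, \quad (a^{-1})^{\sigma} = (a^{\sigma})^{-1}. \tag{9.1.1}$$

Then $\sigma$ is either a homomorphism or an antihomomorphism.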


Proof. (E. Artin) We must show that either $(ab)^{\sigma} = a^{\sigma}b^{\sigma}$ for all $a, b \in K$, or
$(ab)^{\sigma} = b^{\sigma}a^{\sigma}$ for all $a, b \in K$. We observe that $a^{\sigma}(a^{-1})^{\sigma} = 1$ by (9.1.1), so $a \neq 0 \Rightarrow
a^{\sigma} \neq 0$ and it follows that $\sigma$ is injective. We start from the following identity (Hua's
identity):

$$a - (a^{-1} + (b^{-1} - a)^{-1})^{-1} = aba, \tag{9.1.2}$$

valid whenever all inversions are defined, i.e. $ab \neq 0, 1$. To prove (9.1.2), we observe
that for any $x \neq 0, 1$,

$$(x^{-1} - 1)^{-1} = (1 - x)^{-1} - 1, \tag{9.1.3}$$

as we see by multiplying out. Let $ab \neq 0, 1$; then $a^{-1}(b^{-1} - a) = (ba)^{-1} - 1$, hence
on taking $x = ba$ in (9.1.3), we find

$$(b^{-1} - a)^{-1}a = ((ba)^{-1} - 1)^{-1} = (1 - ba)^{-1} - 1,$$
$$(b^{-1} - a)^{-1} = (1 - ba)^{-1}a^{-1} - a^{-1} = (a - aba)^{-1} - a^{-1}.$$

Hence

$$a - aba = (a^{-1} + (b^{-1} - a)^{-1})^{-1},$$

and (9.1.2) follows from this on rearranging the terms.
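
Hua's identity $a - (a^{-1} + (b^{-1} - a)^{-1})^{-1} = aba$ can be checked in a concrete non-commutative division ring. The following is a minimal sketch, assuming the quaternions with rational coefficients as the skew field; the class `Quat` and the sample elements `a`, `b` are illustrative choices and do not come from the text.

```python
from fractions import Fraction


class Quat:
    """Quaternion w + x i + y j + z k with rational coefficients (illustrative helper)."""

    def __init__(self, w, x=0, y=0, z=0):
        self.c = tuple(Fraction(t) for t in (w, x, y, z))

    def __add__(self, o):
        return Quat(*(s + t for s, t in zip(self.c, o.c)))

    def __sub__(self, o):
        return Quat(*(s - t for s, t in zip(self.c, o.c)))

    def __mul__(self, o):
        a1, b1, c1, d1 = self.c
        a2, b2, c2, d2 = o.c
        return Quat(a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
                    a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
                    a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
                    a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

    def inv(self):
        # Inverse = conjugate divided by the norm; the norm is non-zero for q != 0.
        w, x, y, z = self.c
        n = w * w + x * x + y * y + z * z
        return Quat(w / n, -x / n, -y / n, -z / n)

    def __eq__(self, o):
        return self.c == o.c


# Two non-commuting sample elements with ab != 0, 1.
a = Quat(1, 2, 0, 1)
b = Quat(0, 1, 3, -1)

lhs = a - (a.inv() + (b.inv() - a).inv()).inv()
rhs = a * b * a
assert lhs == rhs      # Hua's identity holds exactly
assert a * b != b * a  # and the example is genuinely non-commutative
```

Exact rational arithmetic is used so that the comparison is an equality test rather than a floating-point approximation.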
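Now $(a^{-1})^{\sigma} = (a^{\sigma})^{-1}$ and we may denote both sides by $a^{-\sigma}$. If we apply $\sigma$ to
(9.1.2) and observe that $\sigma$ is compatible with all the operations on the left, we obtain

$$(aba)^{\sigma} = a^{\sigma}b^{\sigma}a^{\sigma}. \tag{9.1.4}$$

Clearly this still holds if $a$ or $b$ is 0; if $b = a^{-1}$, then $b^{\sigma} = a^{-\sigma}$ and both sides reduce
to $a^{\sigma}$, so (9.1.4) holds in all cases. Put $b = 1$ in (9.1.4):

$$(a^2)^{\sigma} = (a^{\sigma})^2. \tag{9.1.5}$$

Next replace $a$ by $a + b$ in (9.1.5) and simplify, using (9.1.5) again:

$$(ab + ba)^{\sigma} = a^{\sigma}b^{\sigma} + b^{\sigma}a^{\sigma}. \tag{9.1.6}$$

Now consider $(c^{\sigma} - a^{\sigma}b^{\sigma})c^{-\sigma}(c^{\sigma} - b^{\sigma}a^{\sigma})$ for any $c \neq 0$. This equals

$$c^{\sigma} - a^{\sigma}b^{\sigma} - b^{\sigma}a^{\sigma} + a^{\sigma}b^{\sigma}c^{-\sigma}b^{\sigma}a^{\sigma} = c^{\sigma} - (ab + ba)^{\sigma} + (abc^{-1}ba)^{\sigma} = (c - ab - ba + abc^{-1}ba)^{\sigma},$$

by an application of (9.1.6), (9.1.4) and (9.1.1). Thus

$$(c^{\sigma} - a^{\sigma}b^{\sigma})c^{-\sigma}(c^{\sigma} - b^{\sigma}a^{\sigma}) = (c - ab - ba + abc^{-1}ba)^{\sigma} \tag{9.1.7}$$

for all $a, b, c$ such that $c \neq 0$. For $c = ab$ the right-hand side reduces to
$ab - ab - ba + ba = 0$, hence the left-hand side of (9.1.7) vanishes for $c = ab$.
Thus $(ab)^{\sigma}$ is either $a^{\sigma}b^{\sigma}$ or $b^{\sigma}a^{\sigma}$; it only remains to show that the same alternative
holds for all pairs.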

Fix $a \in K$ and put $U_a = \{b \in K \mid (ab)^{\sigma} = a^{\sigma}b^{\sigma}\}$, $V_a = \{b \in K \mid (ab)^{\sigma} = b^{\sigma}a^{\sigma}\}$.
They are clearly subgroups of the additive group of $K$ whose union is $K$, hence for
each $a \in K$, one of them must be all of $K$ (see Exercise 2). Now put
$U = \{a \in K \mid U_a = K\}$, $V = \{a \in K \mid V_a = K\}$; then $U, V$ are again subgroups
whose union is $K$, so one of them must be all of $K$, i.e. either $(ab)^{\sigma} = a^{\sigma}b^{\sigma}$ for all
$a, b \in K$ or $(ab)^{\sigma} = b^{\sigma}a^{\sigma}$ for all $a, b \in K$. ∎


This result is used in projective geometry to show that a bijective transformation
of the line which preserves harmonic ranges necessarily has the form $x \mapsto ax^{\sigma} + b$,
where $a \neq 0$ and $\sigma$ is an automorphism or an antiautomorphism (for $K = \mathbb{R}$ this
is von Staudt's theorem, see E. Artin (1957) p. 37).


Exercises 


1. Let $k$ be a commutative field of characteristic 0. Verify that the field of fractions of
the Weyl algebra $A_1(k)$ has centre $k$. What is the centre when $k$ has prime
characteristic?

2. In the proof of Theorem 9.1.3 the fact was used that a group $G$ cannot be the
union of two proper subgroups $H, K$. By considering the product of two elements,
one not in $H$ and one not in $K$, prove this fact.

3. Let $K$ be a skew field. Show that for any $c \in K$ the centralizer of the set of all
conjugates of $c$ is either $K$ or the centre of $K$. Deduce that no non-central element $c$
can satisfy the identity $cx^{-1}cx = x^{-1}cxc$ for all $x \in K^{\times}$.

4. Let $\sigma : K \to L$ be a mapping between fields such that $(x + y)^{\sigma} = x^{\sigma} + y^{\sigma}$, $1^{\sigma} = \lambda$,
$(x^{\sigma})^{-1} = \lambda^{-1}(x^{-1})^{\sigma}\lambda^{-1}$. Show that $x^{\sigma} = x^{\tau}\lambda$, where $\tau$ is a homomorphism or
an antihomomorphism.

5. Show that any non-central element in a skew field has infinitely many conjugates.

6. Let $K$ be a skew field. Show that any conjugacy class of elements of $K$ outside the
centre generates $K$ as a field. Show that any subfield containing $K^{\times\prime}$ coincides
with $K$.

7. (Herstein) Let $K$ be a skew field of prime characteristic and $G$ a finite subgroup of
$K^{\times}$. Denoting by $P$ the prime subfield of $K$, show that the $P$-space spanned by $G$ is
an algebra, hence a finite field. Deduce that $G$ must be cyclic.

8. Let $K$ be a skew field. Show that any abelian normal subgroup of $K^{\times}$ is contained
in the centre of $K$.


9.2 The Dieudonné determinant


The determinant is a fundamental concept, going back much further than the notion 
of matrix, on which it is based (matrices were introduced by Arthur Cayley in the 
middle of the 19th century, whereas determinants had been used at the end of the 
17th century by Gottfried Wilhelm von Leibniz). Today we regard the determinant 
of a square matrix as an alternating multilinear function of the columns of the 
matrix; its most important property is that its vanishing characterizes the linear 
dependence of its columns. Here the entries of the matrix are assumed to lie in a 
(commutative) field, but it is clear that the definition is unchanged when the entries 
are taken from any commutative ring. So it is natural to try to extend the definition
to the non-commutative case; we have seen in Section 5.3 how this can be done for a
finite-dimensional algebra by means of the norm. The general definition is due to
Jean Dieudonné [1943]; before presenting it let us examine the simplest case, that
of a $2 \times 2$ matrix.


The columns of the matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ over a skew field $K$ say, are linearly dependent
iff the equations

$$ax + by = 0, \quad cx + dy = 0 \tag{9.2.1}$$

have a non-trivial solution $(x, y)$. If $a = 0$, such a solution exists precisely when
$b = 0$ or $c = 0$, so let us assume that $a \neq 0$. Then in any non-trivial solution
$y \neq 0$ and by eliminating $x$ from (9.2.1) we obtain $(d - ca^{-1}b)y = 0$, hence the
condition for linear dependence is

$$d - ca^{-1}b = 0. \tag{9.2.2}$$


Depending on which entries of $A$ are zero, we can find various expressions whose
vanishing characterizes the linear dependence of the columns of $A$, but a few trials
make it clear that there is no polynomial in $a, b, c, d$ with this property. Thus
we must expect a determinant function (if one exists) to be a rational function.
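
The criterion $d - ca^{-1}b = 0$ for $a \neq 0$ can be tried out on a small example. The sketch below assumes the rational quaternions as the skew field and uses the matrix with rows $(1, j)$ and $(i, k)$, whose second column is the first multiplied on the right by $j$; the class `Quat` and the variable names are illustrative assumptions. It only shows that one naive candidate, $ad - bc$, fails to vanish on this dependent pair of columns; it is not a proof that no polynomial works.

```python
from fractions import Fraction


class Quat:
    """Quaternion w + x i + y j + z k with rational coefficients (illustrative helper)."""

    def __init__(self, w, x=0, y=0, z=0):
        self.c = tuple(Fraction(t) for t in (w, x, y, z))

    def __add__(self, o):
        return Quat(*(s + t for s, t in zip(self.c, o.c)))

    def __sub__(self, o):
        return Quat(*(s - t for s, t in zip(self.c, o.c)))

    def __mul__(self, o):
        a1, b1, c1, d1 = self.c
        a2, b2, c2, d2 = o.c
        return Quat(a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
                    a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
                    a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
                    a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

    def inv(self):
        w, x, y, z = self.c
        n = w * w + x * x + y * y + z * z
        return Quat(w / n, -x / n, -y / n, -z / n)

    def __eq__(self, o):
        return self.c == o.c


zero, one = Quat(0), Quat(1)
i, j, k = Quat(0, 1), Quat(0, 0, 1), Quat(0, 0, 0, 1)

# Matrix (a b; c d) = (1 j; i k): second column = first column * j.
a, b, c, d = one, j, i, k

schur = d - c * a.inv() * b   # the expression in (9.2.2)
naive = a * d - b * c         # the commutative formula ad - bc

# A non-trivial solution of ax + by = 0, cx + dy = 0, obtained by eliminating x.
y = one
x = zero - a.inv() * b * y

assert schur == zero
assert a * x + b * y == zero and c * x + d * y == zero
assert naive != zero          # ad - bc does not detect the dependence here
```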


A second point is that under any reasonable definition one would expect $\begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}$
and $\begin{pmatrix} b & 0 \\ 0 & a \end{pmatrix}$ to have the same determinant. This suggests that even for a skew field
the values of the determinant must lie in an abelian group. For any field $K$ we define
the abelianized group $K^{ab}$ as

$$K^{ab} = K^{\times}/K^{\times\prime},$$

where $K^{\times\prime}$ is the derived group of the multiplicative group $K^{\times}$. Thus for a
commutative field $K^{ab}$ reduces to $K^{\times}$. It is clear that $K^{ab}$ is universal for homomorphisms of
$K^{\times}$ into abelian groups.

As usual we write $\mathrm{GL}_n(K)$ for the group of all invertible $n \times n$ matrices over $K$. Let
us recall that any $m \times n$ matrix $A$ over $K$ may be interpreted as the matrix of a linear
mapping from an $m$-dimensional to an $n$-dimensional vector space over $K$. By
choosing suitable bases in these spaces we can ensure that $A$ takes the form $I_r \oplus 0$,
where $r$ is the rank of $A$. Thus there exist $P \in \mathrm{GL}_m(K)$, $Q \in \mathrm{GL}_n(K)$ such that

$$PAQ = \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}. \tag{9.2.3}$$


This was proved in the remark after Proposition 7.2.3. In particular, (9.2.3) shows 
that a square matrix over any field is invertible iff it is left (or equivalently, right) 
regular (see also Corollary 9.2.3 below). Such a matrix is also called non-singular 
and a non-invertible matrix is called singular. 

Without a determinant we cannot define $\mathrm{SL}_n$, but we have the group $E_n(K)$
generated by all elementary matrices $B_{ij}(c) = I + cE_{ij}$ (see Section 3.5). We observe
that multiplication by an elementary matrix corresponds to an elementary operation
on matrices; more precisely, left multiplication by $B_{ij}(c)$ amounts to adding $c$ times
the $j$-th row to the $i$-th row and right multiplication by $B_{ij}(c)$ means adding the $i$-th
column multiplied by $c$ to the $j$-th column.

As a first result we show that we can pass from AB to BA by elementary transfor- 
mations, provided that the matrices are ‘enlarged’ by forming their diagonal sum 
with a unit matrix. Here it is not necessary to assume that the coefficients lie in a 
field. 


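Lemma 9.2.1 (Whitehead's lemma). Let $R$ be any ring and $n \ge 1$. For any
$A, B \in \mathrm{GL}_n(R)$, $AB \oplus I$ and $BA \oplus I$ lie in the same (left or right) coset of $E_{2n}(R)$, a
fact which may be expressed by writing

$$\begin{pmatrix} AB & 0 \\ 0 & I \end{pmatrix} \equiv \begin{pmatrix} BA & 0 \\ 0 & I \end{pmatrix} \pmod{E_{2n}(R)}. \tag{9.2.4}$$


Proof. We must show that

$$A^{-1}B^{-1}AB \oplus I \in E_{2n}(R). \tag{9.2.5}$$

In the first place we note that for any $C \in \mathrm{GL}_n(R)$,

$$\begin{pmatrix} 0 & C \\ -C^{-1} & 0 \end{pmatrix} = \begin{pmatrix} I & C \\ 0 & I \end{pmatrix}\begin{pmatrix} I & 0 \\ -C^{-1} & I \end{pmatrix}\begin{pmatrix} I & C \\ 0 & I \end{pmatrix} \in E_{2n}(R), \tag{9.2.6}$$

for each matrix on the right can be written as a product of $n^2$ elementary matrices.
Hence we have

$$\begin{pmatrix} C^{-1} & 0 \\ 0 & C \end{pmatrix} = \begin{pmatrix} 0 & -I \\ I & 0 \end{pmatrix}\begin{pmatrix} 0 & C \\ -C^{-1} & 0 \end{pmatrix} \in E_{2n}(R), \tag{9.2.7}$$

for the matrices on the right are instances of (9.2.6). Now we have

$$\begin{pmatrix} A^{-1}B^{-1}AB & 0 \\ 0 & I \end{pmatrix} = \begin{pmatrix} A^{-1} & 0 \\ 0 & A \end{pmatrix}\begin{pmatrix} B^{-1} & 0 \\ 0 & B \end{pmatrix}\begin{pmatrix} AB & 0 \\ 0 & (AB)^{-1} \end{pmatrix} \in E_{2n}(R). \ ∎$$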
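
The two block factorizations used in this proof, namely that $(0\ C;\ -C^{-1}\ 0)$ equals $(I\ C;\ 0\ I)(I\ 0;\ -C^{-1}\ I)(I\ C;\ 0\ I)$ and that $(C^{-1}\ 0;\ 0\ C)$ equals $(0\ -I;\ I\ 0)(0\ C;\ -C^{-1}\ 0)$, can be confirmed on a small example. The sketch below assumes $n = 2$ and an integer matrix `C` of determinant 1 chosen purely for illustration; the helper names `matmul`, `block` and `neg` are likewise assumptions.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def block(P, Q, R, S):
    """Assemble the block matrix (P Q; R S) from four n x n blocks."""
    return [p + q for p, q in zip(P, Q)] + [r + s for r, s in zip(R, S)]


def neg(X):
    return [[-x for x in row] for row in X]


I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
C = [[2, 1], [1, 1]]          # sample invertible matrix (determinant 1)
C_inv = [[1, -1], [-1, 2]]    # its inverse, entered by hand
assert matmul(C, C_inv) == I2

upper = block(I2, C, Z2, I2)             # (I C; 0 I)
lower = block(I2, Z2, neg(C_inv), I2)    # (I 0; -C^{-1} I)

# First factorization: (0 C; -C^{-1} 0) as a product of three block-triangular matrices.
lhs_first = block(Z2, C, neg(C_inv), Z2)
rhs_first = matmul(matmul(upper, lower), upper)
assert lhs_first == rhs_first

# Second factorization: diag(C^{-1}, C) = (0 -I; I 0) (0 C; -C^{-1} 0).
rot = block(Z2, neg(I2), I2, Z2)
lhs_second = block(C_inv, Z2, Z2, C)
rhs_second = matmul(rot, lhs_first)
assert lhs_second == rhs_second
```

Both identities are formal consequences of block multiplication, so the same check goes through for any invertible `C` over any ring.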


As we shall soon see, in the case of a skew field $E_n(K)$ is the derived group
$\mathrm{GL}_n(K)'$ (except when $n = 2 = |K|$), in particular, $E_n$ is a normal subgroup of
$\mathrm{GL}_n$; for a commutative field $k$, $E_n(k) = \mathrm{SL}_n(k)$ and the result was proved in
Proposition 3.5.2.

We can embed $\mathrm{GL}_n(K)$ in $\mathrm{GL}_{n+1}(K)$ by mapping $A$ to $A \oplus 1$. In this way we
obtain an ascending chain

$$\mathrm{GL}_1(K) \subset \mathrm{GL}_2(K) \subset \dots,$$

whose union is again a group, written $\mathrm{GL}(K)$ and called the stable general linear
group. Its elements may be thought of as infinite matrices which differ from the
unit matrix only in a finite square. Similarly the union of the groups $E_n(K)$ is a
group $E(K)$.


In order to obtain a definition for the determinant we shall need to refine the
expression (9.2.3) for a matrix. A square matrix is called lower unitriangular if all
the entries on the main diagonal are 1 and those above it are 0; it is clear from
the definition that such a matrix is a product of elementary matrices and hence is
invertible. Moreover, the lower unitriangular matrices form a group under
multiplication. An upper unitriangular matrix is defined similarly. We observe that left
multiplication by a lower unitriangular matrix amounts to adding left multiples of certain
rows to later rows, and right multiplication by an upper unitriangular matrix comes
to adding right multiples of certain columns to later columns.

We can now describe a decomposition which applies to any matrix over a skew
field. Our account follows that of Peter Draxl (1983), with some simplifications.


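Theorem 9.2.2 (Bruhat normal form). Let $K$ be a skew field and $A \in {}^mK^n$. Then $A$
can be written in the form

$$A = LMU, \tag{9.2.8}$$

where $L$ is an $m \times m$ lower unitriangular matrix, $U$ an $n \times n$ upper unitriangular
matrix and $M$ is an $m \times n$ matrix with at most one non-zero entry in any row or
column. Moreover, any other such decomposition of $A$ leads to the same matrix $M$.

The matrix $M$ is called the core of $A$ and (9.2.8) is the Bruhat decomposition. For
example, when $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, then the Bruhat decomposition is

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{cases}
\begin{pmatrix} 1 & 0 \\ ca^{-1} & 1 \end{pmatrix}\begin{pmatrix} a & 0 \\ 0 & d - ca^{-1}b \end{pmatrix}\begin{pmatrix} 1 & a^{-1}b \\ 0 & 1 \end{pmatrix} & \text{if } a \neq 0, \\[2ex]
\begin{pmatrix} 1 & 0 \\ db^{-1} & 1 \end{pmatrix}\begin{pmatrix} 0 & b \\ c & 0 \end{pmatrix} & \text{if } a = 0 \neq b, \\[2ex]
\begin{pmatrix} 0 & 0 \\ c & 0 \end{pmatrix}\begin{pmatrix} 1 & c^{-1}d \\ 0 & 1 \end{pmatrix} & \text{if } a = b = 0 \neq c, \\[2ex]
\begin{pmatrix} 0 & 0 \\ 0 & d \end{pmatrix} & \text{if } a = b = c = 0.
\end{cases}$$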


Proof. Suppose that the first non-zero row of $A$ is the $i$-th row and that its first
non-zero entry is $a_{ij}$. By adding left multiples of the $i$-th row to each succeeding row we
can reduce every entry in the $j$-th column except $a_{ij}$ to 0. These operations
correspond to left multiplication by a certain lower unitriangular matrix. Next we add
right multiples of the $j$-th column to succeeding columns to reduce all entries in
the $i$-th row except $a_{ij}$ to 0; these operations will correspond to right multiplication
by an upper unitriangular matrix. As a result $a_{ij}$ is the only non-zero element in its
row and column, and all the rows above the $i$-th are zero. Next we take the first
non-zero row after the $i$-th, say the $k$-th row and with the first non-zero entry $a_{kl}$ in this
row we continue the process. After at most $m$ steps $A$ has been reduced to the form
$M$ where each row and each column has at most one non-zero entry, with a left
factor which is lower and a right factor which is upper unitriangular. This is the
required decomposition (9.2.8).
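
The reduction described in this proof is effectively an algorithm. The following is a minimal sketch of it, assuming entries in the rational numbers (a commutative special case, chosen only so that exact arithmetic is available from the standard library); the function names `bruhat`, `matmul` and `identity` are illustrative. Left and right multiplications are kept in the order the proof uses, so the same steps would apply to any entry type with non-commutative multiplication and inverses.

```python
from fractions import Fraction


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def matmul(X, Y):
    return [[sum((X[i][k] * Y[k][j] for k in range(len(Y))), Fraction(0))
             for j in range(len(Y[0]))] for i in range(len(X))]


def bruhat(A):
    """Return (L, M, U) with A = L M U, L lower and U upper unitriangular,
    and M having at most one non-zero entry in each row and column."""
    m, n = len(A), len(A[0])
    M = [[Fraction(x) for x in row] for row in A]
    L, U = identity(m), identity(n)
    for i in range(m):
        # First non-zero entry of the current row, if any.
        j = next((t for t in range(n) if M[i][t] != 0), None)
        if j is None:
            continue
        p_inv = 1 / M[i][j]
        # Clear the j-th column below row i by subtracting left multiples of row i.
        for k in range(i + 1, m):
            f = M[k][j] * p_inv
            if f != 0:
                for t in range(n):
                    M[k][t] -= f * M[i][t]
                for t in range(m):        # L <- L * B_ki(f) keeps A = L M U
                    L[t][i] += L[t][k] * f
        # Clear the i-th row to the right of column j using right multiples of column j.
        for l in range(j + 1, n):
            g = p_inv * M[i][l]
            if g != 0:
                for t in range(m):
                    M[t][l] -= M[t][j] * g
                for t in range(n):        # U <- B_jl(g) * U keeps A = L M U
                    U[j][t] += g * U[l][t]
    return L, M, U


A = [[0, 2, 1], [3, 1, 4], [6, 2, 9]]
L, M, U = bruhat(A)
assert matmul(matmul(L, M), U) == A
```

For a $2 \times 2$ matrix with $a \neq 0$ the core returned has diagonal entries $a$ and $d - ca^{-1}b$, in agreement with the condition (9.2.2).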


If $A = L'M'U'$ is another such decomposition, then on writing $P = L^{-1}L'$,
$Q = U'U^{-1}$, we have

$$PM'Q = M,$$

where $P$ is again lower and $Q$ upper unitriangular. This tells us that we can pass from
$M'$ to $M$ by adding left multiples of rows to later rows and right multiples of
columns to later columns. If $m'_{rs}$ is a non-zero entry of $M'$, the application of
these operations only affects entries in the $r$-th row or $s$-th column and leaves $m'_{rs}$
itself unchanged. Hence in any operation on $M'$, the $(i, j)$-entry is affected only if
$M'$ has either a non-zero entry in the $i$-th row before the $j$-th column, or a
non-zero entry in the $j$-th column above the $i$-th row. In either case $m'_{ij} = 0$, while $m'_{rs}$
remains unchanged. Hence $m_{rs} = m'_{rs}$ and it follows that $m_{ij} = 0$. Therefore
$M' = M$, as we wished to show. ∎


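We remark that the matrices $L, U$ in (9.2.8) are not generally unique. For example,
we have

$$\begin{pmatrix} 1 & 0 \\ a & 1 \end{pmatrix}\begin{pmatrix} 0 & u \\ v & 0 \end{pmatrix}\begin{pmatrix} 1 & -v^{-1}au \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & u \\ v & 0 \end{pmatrix}.$$

For a truly unique form see Exercise 4.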


Corollary 9.2.3. For any $m \times n$ matrix $A$ over a skew field $K$ the following conditions
are equivalent:

(a) $A$ is left regular: $XA = 0 \Rightarrow X = 0$,
(b) $A$ has a right inverse: $AB = I$ for some $B \in {}^nK^m$.

Moreover, when (a), (b) hold, then $m \le n$, with equality if and only if (a), (b) are
equivalent to their left-right analogues.


Proof. Either of (a), (b) clearly holds for $A$ precisely when it holds for its core and it
holds for the core iff each row has a non-zero entry. ∎


Let us define a monomial matrix as a square matrix with precisely one non-zero 
entry in each row and each column, e.g. the core of an invertible matrix is a mono- 
mial matrix. If all the non-zero entries in a monomial matrix are 1, we have a 
permutation matrix; this may also be defined as the matrix obtained by permuting 
the rows (or equivalently, the columns) of the unit matrix. The determinant of a per- 
mutation matrix is 1 or —1 according as the permutation is even or odd. Sometimes 
it is more convenient to have matrices of determinant 1; this can be accomplished by 
using a signed permutation matrix, i.e. a matrix obtained from the unit matrix by a 
series of operations which consist in interchanging two columns and changing the 
sign of one of them. By (9.2.6) such a matrix is a product of elementary matrices. 

Any monomial matrix M may be written in the form 


M = DP, (9.2.9) 


where $D$ is a diagonal matrix and $P$ is a signed permutation matrix. By Corollary
9.2.3 any matrix over a skew field is invertible iff its core is a monomial matrix.
With the help of Theorem 9.2.2 we can also identify $E_n$:


Proposition 9.2.4. For any skew field $K$ and any $n \ge 1$, we have

$$\mathrm{GL}_n(K) = D_n(K) \cdot E_n(K), \tag{9.2.10}$$

where $D_n(K)$ is the group of all diagonal matrices in $\mathrm{GL}_n(K)$. Moreover, for any $n \ge 2$,

$$E_n(K) = \mathrm{GL}_n(K)', \tag{9.2.11}$$

except when $n = 2$, $K = \mathbb{F}_2$, when $E_2(\mathbb{F}_2) = \mathrm{GL}_2(\mathbb{F}_2)$.


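Proof. By Theorem 9.2.2, any invertible matrix can be written as $LDPU$, where $L$ is
lower, $U$ is upper unitriangular, $D$ is diagonal and $P$ is a signed permutation matrix.
Now $P$ can be written as a product of elementary matrices, using (9.2.6); moreover,
$LD = DL'$, where $L'$ is again lower unitriangular, hence our matrix takes the form
$DF$, where $F \in E_n(K)$, and this proves (9.2.10).

To establish (9.2.11), let us write $(A, B) = A^{-1}B^{-1}AB$ and $(H, K)$ for the
subgroup generated by all $(A, B)$, $A \in H$, $B \in K$. We shall also write $G_n$ for $\mathrm{GL}_n(K)$
and similarly for $D_n$, $E_n$. We first show that $(G_n, G_n) \subseteq E_n$; by (9.2.10) this will
follow if we show that $(D_n, D_n) \subseteq E_n$ and $(D_n, E_n) \subseteq E_n$. The first inclusion follows
from Lemma 9.2.1 (because $n \ge 2$), while the second results from the formula

$$\begin{pmatrix} u^{-1} & 0 \\ 0 & v^{-1} \end{pmatrix}\begin{pmatrix} 1 & -a \\ 0 & 1 \end{pmatrix}\begin{pmatrix} u & 0 \\ 0 & v \end{pmatrix}\begin{pmatrix} 1 & a \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & a - u^{-1}av \\ 0 & 1 \end{pmatrix}.$$

In the other direction we have

$$\begin{pmatrix} 1 & a \\ 0 & 1 \end{pmatrix} = \left( \begin{pmatrix} 1 & 0 \\ 0 & v \end{pmatrix}, \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix} \right) = \begin{pmatrix} 1 & b(1 - v) \\ 0 & 1 \end{pmatrix}, \tag{9.2.12}$$

provided that $a = b(1 - v)$. If $K \neq \mathbb{F}_2$, there is an element $v \neq 0, 1$ and putting
$b = a(1 - v)^{-1}$, we can use (9.2.12) to express $B_{12}(a)$ as a commutator. Hence
$E_2 \subseteq G_2'$ whenever $K \neq \mathbb{F}_2$. If $n \ge 3$, we have

$$\begin{pmatrix} 1 & a & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \left( \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & a & 1 \end{pmatrix} \right),$$

so we again have $E_n \subseteq G_n'$ and (9.2.11) follows. When $K = \mathbb{F}_2$ and $n = 2$, then
$\mathrm{GL}_2(\mathbb{F}_2)$ is the symmetric group of degree 3 and is equal to $E_2(\mathbb{F}_2)$, as is easily
verified. ∎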
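
The two ways of writing the elementary matrix $B_{12}(a)$ as a commutator $(A, B) = A^{-1}B^{-1}AB$, one with a diagonal matrix $\mathrm{diag}(1, v)$ in the $2 \times 2$ case and one with $B_{13}(1)$ and $B_{32}(a)$ in the $3 \times 3$ case, can be confirmed numerically. This is a minimal sketch assuming rational entries (a commutative special case) and sample values `a = 5`, `v = 3`; the helper names are illustrative.

```python
from fractions import Fraction


def matmul(X, Y):
    return [[sum((X[i][k] * Y[k][j] for k in range(len(Y))), Fraction(0))
             for j in range(len(Y[0]))] for i in range(len(X))]


def elementary(n, i, j, c):
    """B_ij(c) = I + c E_ij, with 1-based indices i != j."""
    B = [[Fraction(int(r == s)) for s in range(n)] for r in range(n)]
    B[i - 1][j - 1] = Fraction(c)
    return B


def commutator(A, A_inv, B, B_inv):
    """(A, B) = A^{-1} B^{-1} A B, with the inverses supplied by the caller."""
    return matmul(matmul(A_inv, B_inv), matmul(A, B))


a, v = Fraction(5), Fraction(3)   # sample values, v != 0, 1

# 2 x 2 case: B_12(a) = (diag(1, v), B_12(b)) with b = a (1 - v)^{-1}.
b = a / (1 - v)
D = [[Fraction(1), Fraction(0)], [Fraction(0), v]]
D_inv = [[Fraction(1), Fraction(0)], [Fraction(0), 1 / v]]
two_by_two = commutator(D, D_inv, elementary(2, 1, 2, b), elementary(2, 1, 2, -b))
assert two_by_two == elementary(2, 1, 2, a)

# 3 x 3 case: B_12(a) = (B_13(1), B_32(a)); note B_ij(c)^{-1} = B_ij(-c).
three_by_three = commutator(elementary(3, 1, 3, 1), elementary(3, 1, 3, -1),
                            elementary(3, 3, 2, a), elementary(3, 3, 2, -a))
assert three_by_three == elementary(3, 1, 2, a)
```

The $2 \times 2$ construction needs an element $v \neq 0, 1$, which is exactly what is missing over the field of two elements; the $3 \times 3$ construction needs no such element.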


We now define for any skew field $K$, a mapping $\delta : \mathrm{GL}(K) \to K^{ab}$ from the stable
linear group to the abelianized group of $K$, as follows: for $A \in \mathrm{GL}_n(K)$, $\delta(A) =
\prod_{i=1}^{n} \bar{d}_i$, where the $d_i$ are the diagonal elements of $D$ in the expression (9.2.9) for the
core of $A$ and $\bar{d}_i$ is its residue class mod $K^{\times\prime}$. The value $\delta(A)$ is called the Dieudonné
determinant, or simply the determinant of $A$. To obtain its properties we shall need
to find how it is affected by permutations of its rows; for simplicity we consider the
effect of signed permutations.
Lemma 9.2.5. For any $A \in \mathrm{GL}_n(K)$ and any signed permutation matrix $P$,

$$\delta(PA) = \delta(A).$$

Proof. By induction it will be enough to prove the result when $P$ is the signed
permutation matrix corresponding to a transposition, $(r, s)$ say, where $r < s$. Denote
this matrix by $P_0$ and take a Bruhat decomposition (9.2.8) of $A$, where the core is
factorized as in (9.2.9):

$$A = LDPU,$$

and denote the $(s, r)$-entry of $L$ by $b$. We have

$$P_0A = P_0LDPU = L'B_{rs}(-b)P_0DPU = L'D'B_{rs}(c)P_0PU, \tag{9.2.13}$$

where $L'$, $D'$ are again lower unitriangular and diagonal respectively, and $D'$ differs
from $D$ by an interchange of the $r$-th and $s$-th diagonal elements. If $c = 0$ (which is
the case iff $b = 0$) or if the permutation corresponding to $P_0P$ preserves the order of
$r, s$, then the matrix on the right takes the form $L'D'P_0PB_{r's'}(c)U$; this is again in
Bruhat normal form and so we have $\delta(P_0A) = \delta(A)$ in this case.

Suppose now that $c \neq 0$ and that $P_0P$ inverts the order $r, s$. Then the formula

$$\begin{pmatrix} 1 & c \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ c^{-1} & 1 \end{pmatrix}\begin{pmatrix} c & 0 \\ 0 & c^{-1} \end{pmatrix}\begin{pmatrix} 1 & -c^{-1} \\ 0 & 1 \end{pmatrix}$$

shows that, on writing $D_i(c)$ for the matrix differing from the unit matrix only in the
$(i, i)$-entry, which is $c$, we have

$$B_{rs}(c)P_0 = B_{sr}(c^{-1})D_r(c)D_s(c^{-1})B_{rs}(-c^{-1}).$$

Inserting this in the expression (9.2.13) for $P_0A$, we find

$$\delta(P_0A) = \bar{c} \cdot \bar{c}^{-1}\delta(A) = \delta(A),$$

and this proves the assertion in all cases. The conclusion follows by induction. ∎
We can now establish the main property of the determinant: 


Theorem 9.2.6. For any skew field $K$, the determinant function $\delta$ is a homomorphism
giving rise to an exact sequence

$$1 \to E(K) \to \mathrm{GL}(K) \xrightarrow{\ \delta\ } K^{ab} \to 1.$$

In particular, the determinant is unchanged by elementary row or column operations.


Proof. Let $A \in \mathrm{GL}(K)$; we first show that $\delta(BA) = \delta(A)$ for any $B \in E(K)$, and by
induction it is enough to prove this for $B = B_{ij}(a)$. If $i > j$, this matrix is lower
unitriangular, so $BA$ and $A$ have the same core and hence the same determinant. If
$i < j$, let $P_0$ be the signed permutation matrix corresponding to the transposition
$(i, j)$. Using Lemma 9.2.5 and the case just proved, we have

$$\delta(B_{ij}(a)A) = \delta(P_0B_{ij}(a)A) = \delta(B_{ji}(-a)P_0A) = \delta(P_0A) = \delta(A),$$

hence the result holds generally. The same argument applies for multiplying by an
elementary matrix on the right. Now Lemma 9.2.1 shows that

$$\delta(AB) = \delta(BA) \quad \text{for any } A, B \in \mathrm{GL}(K). \tag{9.2.14}$$

Here it is of course necessary to consider $\delta$ as being defined on the stable linear
group.

Now take any two matrices $A, B$ with Bruhat decompositions $A = L_1M_1U_1$,
$B = L_2M_2U_2$. We have, by what has been proved and (9.2.14),

$$\delta(AB) = \delta(L_1M_1U_1L_2M_2U_2) = \delta(M_1U_1L_2M_2) = \delta(M_2M_1U_1L_2) = \delta(M_2M_1).$$

Further it is clear from the definition of $\delta$ that $\delta(M_2M_1) = \delta(M_1M_2) = \delta(M_1)\delta(M_2)$,
so we obtain

$$\delta(AB) = \delta(A)\delta(B). \tag{9.2.15}$$

This shows $\delta$ to be a homomorphism. Clearly its image is $K^{ab}$; its kernel includes
$\mathrm{GL}(K)' = E(K)$ because $K^{ab}$ is abelian. Conversely, if $\delta(A) = 1$, then the core of
$A$ has the form $DP$, where $P$ is a signed permutation matrix and so $1 = \delta(A) =
\delta(D)$. By (9.2.7) we can apply elementary operations to reduce $D$ to the form $D_1(c)$,
where $c$ is the product of the diagonal elements of $D$. But by hypothesis $c = 1$, hence
$D$ has been reduced to $I$ and so $A \in E_n(K)$, as we wished to show. ∎

Recently another more general form, the quasideterminant, has been defined by
Izrail Gelfand and Vladimir Retakh [1997], which is essentially a rational expression
defined recursively in terms of the $(n - 1) \times (n - 1)$ submatrices.


Exercises 


1. Show that the transpose of every invertible matrix over a field $K$ is invertible iff $K$
is commutative. (Hint. Try a $2 \times 2$ matrix with (1, 1)-entry 1.)

2. Use Theorem 9.2.2 to show that $\mathrm{GL}_n(R) = D_n(R)E_n(R)$ for any local ring $R$.

3. Show that $\mathrm{GL}_2(\mathbb{F}_2) = E_2(\mathbb{F}_2) \cong \mathrm{Sym}_3$.

4. Let $K$ be a skew field. Show that $A \in \mathrm{GL}_n(K)$ can be written as $A = LDPU$, where
$L$ is lower, $U$ is upper unitriangular, $D$ is diagonal, $P$ is a permutation matrix and
$PUP^{-1}$ is also upper triangular. Moreover, such a representation is unique (Draxl
(1983); this is known as the strict Bruhat normal form).

5. Show that a homomorphism $\mathrm{GL}_n(K) \to \mathrm{Sym}_n$ can be defined by associating with
$A \in \mathrm{GL}_n(K)$ the permutation matrix $P$ from the representation $A = LDPU$.

6. Show that if $P$ is the permutation matrix obtained by applying a permutation $\sigma$ to
the rows of $I$, then it can also be obtained by applying $\sigma^{-1}$ to the columns of $I$.


7. Let $K$ be a skew field with centre $C$. Show that $\delta$ restricted to $C$ reduces to the
usual determinant, provided that no element of $C$ other than 1 is a product of
commutators.


9.3 Free fields 


We have seen that free rings and free algebras may be defined by a universal property;
this is not to be expected for fields, since fields do not form a variety (see Theorem
1.3.7). Nevertheless, in the commutative case the rational function field $k(x_1, \dots, x_d)$
may be regarded as a 'free' field in the sense that all other fields generated by $d$
elements over $k$ can be obtained from it by specialization. Moreover, it is the field
of fractions of the polynomial ring $k[x_1, \dots, x_d]$ and as such it is uniquely
determined. By contrast, the free algebra $k\langle x_1, \dots, x_d\rangle$ has more than one field of fractions
(see Exercise 2), but this leaves the question whether in the general case there exists
a field that is universal for specializations. A full study of these questions is beyond
the scope of this book (see Cohn (1985), Chapter 7), but it is possible to prove the
existence of free fields in a relatively straightforward way, and this will now be done.

For a general theory of fields it is convenient to invert matrices rather than
elements. We shall not enter into details, but we have to consider which matrices
can become invertible under a homomorphism to a field. Clearly we can confine
ourselves to square matrices. If $A$ is an $n \times n$ matrix over a ring $R$, and $A$ can be
written in the form

$$A = PQ, \quad \text{where } P \in {}^nR^r,\ Q \in {}^rR^n, \tag{9.3.1}$$

then it is clear that under any homomorphism from $R$ to a field we again have a
factorization as in (9.3.1), hence the image of $A$ cannot have a rank greater than $r$,
and so cannot be invertible when $r < n$. The least possible value of $r$ in a
factorization of $A$ as in (9.3.1) is called the inner rank of $A$ over $R$ and is denoted by $\rho A$. It is
easily verified that over a field (even skew) the inner rank reduces to the usual rank.
Thus an $n \times n$ matrix over any ring cannot become invertible under a
homomorphism to a field, unless its inner rank is $n$.

A square matrix over any ring R is said to be full if its inner rank equals the 
number of rows. The above remarks show that in studying matrices that can be 
inverted under a homomorphism to a field, we can confine our attention to full 
matrices. Our aim in this section is to show that for the tensor ring F there exists 
a field U containing F and generated by it as field, such that any full matrix over 
F can be inverted over U. This field U is a universal field of fractions of F, in the 
sense that there is a specialization from U to any other field which is obtained by 
a homomorphism from F (Cohn (1985), Chapter 7). 

We begin with some remarks on the inner rank. We recall from BA, Section 4.6,
that a ring R is called weakly finite if for any square matrices A, B of the same size,
AB = I implies BA = I. If R is a weakly finite ring which is non-trivial, then
the unit matrix in R must be full, i.e. if A is r × s and B is s × r and AB = I, then
r ≤ s. For if r > s, we can adjoin zero columns to A and zero rows to B to obtain
square matrices. Now we have

\[ (A \;\; 0)\begin{pmatrix} B \\ 0 \end{pmatrix} = I, \quad \text{hence} \quad \begin{pmatrix} B \\ 0 \end{pmatrix}(A \;\; 0) = I. \]

Comparing (r, r)-entries, we obtain 0 = 1, which contradicts the fact that R is non-trivial.


Lemma 9.3.1. Let R be a non-trivial weakly finite ring and consider a partitioned
matrix over R:

\[ A = \begin{pmatrix} A_1 & A_2 \\ A_3 & A_4 \end{pmatrix}, \quad \text{where } A_1 \text{ is } r \times r. \]

If $A_1$ is invertible, then $\rho A \ge r$, with equality if and only if $A_3 A_1^{-1} A_2 = A_4$.


Proof. Clearly the inner rank is unchanged on passing to an associated matrix. Hence
we can make the transformation

\[ A \to \begin{pmatrix} I & A_1^{-1}A_2 \\ A_3 & A_4 \end{pmatrix} \to \begin{pmatrix} I & A_1^{-1}A_2 \\ 0 & A_4 - A_3A_1^{-1}A_2 \end{pmatrix} \to \begin{pmatrix} I & 0 \\ 0 & A_4 - A_3A_1^{-1}A_2 \end{pmatrix}, \]

and these transformations leave the inner rank unchanged. If $\rho A = s$, this matrix can
be written as

\[ PQ = \begin{pmatrix} P_1 \\ P_2 \end{pmatrix}(Q_1 \;\; Q_2), \]

where $P_1$ is r × s and $Q_1$ is s × r. Thus $I = P_1Q_1$, hence r ≤ s by weak finiteness, and
this shows that $\rho A \ge r$. When equality holds, we have $Q_1P_1 = I$, but $P_1Q_2 = 0$, so
$Q_2 = 0$. Similarly $P_2 = 0$ and hence $A_4 - A_3A_1^{-1}A_2 = P_2Q_2 = 0$. The converse is
clear. ∎
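As a concrete check of Lemma 9.3.1, here is a small worked example of our own (not from the text), taken over the field Q, where the inner rank is the usual rank. We take r = 2 and test the criterion $A_3A_1^{-1}A_2 = A_4$:

```latex
\[
A = \begin{pmatrix} 1 & 2 & 1 \\ 0 & 1 & 1 \\ 1 & 3 & 2 \end{pmatrix},
\qquad
A_1 = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix},\quad
A_2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix},\quad
A_3 = \begin{pmatrix} 1 & 3 \end{pmatrix},\quad
A_4 = (2).
\]
\[
A_1^{-1} = \begin{pmatrix} 1 & -2 \\ 0 & 1 \end{pmatrix},
\qquad
A_1^{-1}A_2 = \begin{pmatrix} -1 \\ 1 \end{pmatrix},
\qquad
A_3A_1^{-1}A_2 = -1 + 3 = 2 = A_4 .
\]
```

So the lemma gives $\rho A = 2$, in agreement with the fact that the third row of A is the sum of the first two. Replacing the entry $A_4 = 2$ by any other value makes $A_4 - A_3A_1^{-1}A_2$ non-zero, and the matrix then has rank 3.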


The existence proof for free fields is based on a lemma of independent interest, the 
specialization lemma. This may be regarded as an analogue of the GPI-theorem 
(Theorem 7.8.3), which is used in the proof; its commutative counterpart is the 
elementary result that a polynomial vanishing for all values in an infinite field 
must be the zero polynomial. In the proof we shall need the result that any full
matrix over the tensor ring $D\langle X\rangle$ remains full over the formal power series ring
$D\langle\langle X\rangle\rangle$ (Lemma 5.9.4 of Cohn (1985) or Proposition 6.2.2 of Cohn (1995)).
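The commutative counterpart mentioned above really does need an infinite field. As a hedged illustration (our own sketch, with the hypothetical helper name `vanishes_everywhere`), the non-zero polynomial $x^p - x$ vanishes at every point of the finite field with p elements, by Fermat's little theorem:

```python
def vanishes_everywhere(coeffs, p):
    """True if the polynomial sum(coeffs[i] * x**i) is zero at every point of Z/pZ.

    coeffs[i] is the coefficient of x**i; p is assumed to be prime.
    """
    return all(
        sum(c * pow(a, i, p) for i, c in enumerate(coeffs)) % p == 0
        for a in range(p)
    )


p = 5
f = [0, -1, 0, 0, 0, 1]   # x^5 - x: non-zero, but zero at every point of Z/5Z
g = [1, 0, 1]             # x^2 + 1: non-zero at x = 0
```

With these definitions `vanishes_everywhere(f, p)` holds while `vanishes_everywhere(g, p)` does not. This is the phenomenon that the hypothesis of an infinite centre is designed to exclude in the specialization lemma below.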


Lemma 9.3.2 (Specialization lemma). Let D be a skew field with infinite centre C
and such that [D : C] is infinite. Then any full matrix over the tensor ring $D_C\langle X\rangle$ is
invertible for some choice of X in D.


Proof. Let A = A(x) be any full n × n matrix over $D_C\langle X\rangle$ and denote by r the
supremum of its ranks as its arguments range over D. We have to show that
r = n, so let us assume that r < n. By a translation $x \mapsto x + a$ ($x \in X$, $a \in D$) we may
assume that the maximum rank is assumed at the point x = 0, and by an elementary
transformation we may take the principal r × r minor to be invertible. Thus if

\[ A(x) = \begin{pmatrix} A_1(x) & A_2(x) \\ A_3(x) & A_4(x) \end{pmatrix}, \]

where $A_1$ is r × r, then $A_1(0)$ is invertible. Given $a \in D^X$ and any $t \in C$, we have
$\rho A(ta) \le r$, hence by Lemma 7.2.4, the rank of A(ta) over D(t) is at most r, and
the same holds over D((t)), the field of formal Laurent series in t as central indeterminate.
Now $A_1(ta)$ is a polynomial in t with matrix coefficients and constant term
$A_1(0)$, a unit, hence $A_1(ta)$ is invertible over the power series ring D[[t]]. By Lemma
9.3.1, the equation

\[ A_4(ta) = A_3(ta)A_1(ta)^{-1}A_2(ta) \tag{9.3.2} \]

holds over D[[t]], for all $a \in D^X$. This means that the matrix

\[ A_4(tx) - A_3(tx)A_1(tx)^{-1}A_2(tx) \tag{9.3.3} \]

vanishes when the elements of X are replaced by any values in D. Now (9.3.3) is a
power series in t with coefficients that are matrices over $D_C\langle X\rangle$. Thus the coefficients
are generalized polynomial identities (or identically 0), so by Amitsur's GPI-theorem
(Theorem 7.8.4), the expression (9.3.3) vanishes as a matrix over $D_C\langle X\rangle[[t]]$. It
follows that for r < n, A(tx) is non-full over $D_C\langle X\rangle[[t]]$. Hence we can write
A(tx) as a product PQ, where P is n × r and Q is r × n and P, Q have entries from
$D_C\langle X\rangle[[t]]$; putting t = 1, we obtain a corresponding factorization

\[ A(x) = PQ \tag{9.3.4} \]

over $D_C\langle\langle X\rangle\rangle$ which by the result quoted can be taken over $D_C\langle X\rangle$. Thus A(x) is
non-full over $D_C\langle X\rangle$, a contradiction, which proves the result. ∎


In this lemma the condition $[D : C] = \infty$ is clearly necessary; whether the condition
that C be infinite is needed is not known.
We can now prove the existence of free fields: 


Theorem 9.3.3. Let D be a skew field with centre C and X any set. Then $D_C\langle X\rangle$ can be
embedded in a field U, generated by $D_C\langle X\rangle$, such that every full matrix over $D_C\langle X\rangle$
becomes invertible over U.


Proof. Suppose first that $[D : C] = \infty$, $|C| = \infty$, and consider the mapping

\[ D_C\langle X\rangle \to D^{D^X}, \]

where $p \in D_C\langle X\rangle$ is mapped to $(p_f)$, with $p_f = p(xf)$, for any $f \in D^X$. With each
square matrix A over $D_C\langle X\rangle$ we associate a subset $\mathcal{D}(A)$ of $D^X$ defined by

\[ \mathcal{D}(A) = \{f \in D^X \mid A(xf) \text{ is invertible}\}. \]

$\mathcal{D}(A)$ is called the singularity support of A. Of course $\mathcal{D}(A) = \emptyset$ unless A is full, but
by Lemma 9.3.2, $\mathcal{D}(A) \ne \emptyset$ whenever A is full. If P, Q are any invertible matrices,
then $P \oplus Q$ is invertible, hence $A(x) \oplus B(x)$ becomes singular precisely when A(x) or
B(x) becomes singular, thus

\[ \mathcal{D}(A) \cap \mathcal{D}(B) = \mathcal{D}(A \oplus B). \]

It follows that the family of sets $\mathcal{D}(A)$, where A is full, is closed under finite intersections.
Hence it is contained in an ultrafilter $\mathcal{F}$ on $D^X$ (see Section 1.5), and we have
a homomorphism to an ultrapower

\[ D_C\langle X\rangle \to D^{D^X}/\mathcal{F}, \tag{9.3.5} \]

where by definition, every full matrix A over $D_C\langle X\rangle$ is invertible on $\mathcal{D}(A)$ and so is
invertible in the ultrapower. Hence the subfield of the ultrapower generated by
$D_C\langle X\rangle$ is the required field U.

In the general case we take indeterminates r, s, t and define $D_1 = D(r)$,
$D_2 = D_1(s)$. On $D_2$ we have an automorphism $\alpha: f(s) \mapsto f(rs)$, with fixed field
$D_1$. We now form $E = D_2(t; \alpha)$; the centre of E is the centre of $D_1$, namely C(r)
(see Theorem 7.3.6). This is infinite and E has infinite dimension over C(r), because
the powers $s^n$ are linearly independent. It is clear that we have an embedding
$D_C\langle X\rangle \to E_{C(r)}\langle X\rangle$, and it follows from the inertia lemma (Lemma 8.7.3) that this
embedding is honest, i.e. full matrices are mapped to full matrices. Hence on
taking the field U constructed earlier for $E_{C(r)}\langle X\rangle$, we obtain a field over which
every full matrix over $D_C\langle X\rangle$ becomes invertible. ∎


The field U whose existence has been established in Theorem 9.3.3 is denoted by
$D_C(\langle X\rangle)$ and is called the universal field of fractions of $D_C\langle X\rangle$ or also the free field
over D with centre C (its centre can be shown to be C). The existence proof for
free fields goes back to Shimshon Amitsur [1966], who used his results on generalized
rational identities. The existence of such a universal field of fractions can be
proved more generally for any tensor ring $K_L\langle X\rangle$, where K is any skew field and L
any subfield of K. This is a special case of the fact that every semifir has a universal
field of fractions over which every full matrix can be inverted (see Cohn (1985),
Chapter 7).

We remark that any automorphism of $D_C\langle X\rangle$ is honest and therefore extends to an
automorphism of U. Further, by representing derivations as homomorphisms from
$D_C\langle X\rangle$ to $U_2$ we see that for any automorphisms $\alpha, \beta$ of U, any $(\alpha, \beta)$-derivation of
$D_C\langle X\rangle$ extends to one of U.


Exercises 


1. Show that the n × n unit matrix over a ring R is full iff $R^n$ cannot be generated by
fewer than n elements. Show also that a non-trivial weakly finite ring has IBN.

2. Let E = k(t) be the field of rational functions in one variable t, with the
endomorphism $\alpha_r : f(t) \mapsto f(t^r)$ (r > 1). Show that the subalgebra of $E[x; \alpha_r]$
generated by x and y = xt is free on x, y. Using Exercise 2 of Section 7.3,
obtain for each r > 1 a field of fractions $L_r$ of the free algebra F. Show that these
fields are non-isomorphic as F-rings (J. L. Fisher).

3. Show that an endomorphism $\theta$ of $D_C\langle X\rangle$ can be extended to an endomorphism of
the free field iff $\theta$ is honest.

4. Show that every honest endomorphism is injective; give an example of an
endomorphism of the free algebra $k\langle X\rangle$ which is injective but not honest.

5. Verify that over a skew field the inner rank agrees with the rank.

6. Let K be a skew field with infinite centre. Show that for any square matrix A over
K there is an element $\alpha$ in K such that $A - \alpha I$ is non-singular. For a finite field F
find a matrix A such that A − xI is singular for all values of x in F (for infinite
fields with finite centre the question remains open).

7. Show that over a PID, a square matrix is regular iff it is full. Give an example of a
square matrix over a free algebra which is regular but not full. (Hint. Try a 3 × 3
matrix with a 2 × 2 block of zeros.)


9.4 Valuations on skew fields 


Valuations may be defined on skew fields as in the commutative case, but there have 
so far been fewer applications. This is no doubt due to the inherent difficulties in 
handling general valuations; however in special cases they become more tractable 
and offer the prospect of a means of gaining information on skew fields. Here we 
present a part of the general theory that runs parallel to the commutative case, 
together with some illustrations. 

Let K be a skew field. A subring V of K is said to be total if for every $a \in K^\times$, either
a or $a^{-1}$ lies in V. If for every $a \in K^\times$, $a^{-1}Va = V$, then V is called invariant. Now a
valuation ring of K is an invariant total subring of K. In any valuation ring V in K the
set $\mathfrak{m}$ of all non-units is easily seen to be an ideal, hence V is a local ring with $\mathfrak{m}$ as
maximal ideal. The set U of all units in V is a normal subgroup of $K^\times$; we shall
denote the quotient $K^\times/U$ by $\Gamma$ and call it the value group of V, with natural
homomorphism $v: K^\times \to \Gamma$. We shall use additive notation for $\Gamma$; our main concern
will be the case when $\Gamma$ is abelian. Given $a, b \in K^\times$, we shall write $v(a) \ge v(b)$
iff $ab^{-1} \in V$, or equivalently (because V is invariant), $b^{-1}a \in V$. This relation is
a total ordering on $\Gamma$, for if $v(a) \ge v(b)$, $v(b) \ge v(c)$, then $ab^{-1}, bc^{-1} \in V$,
hence $ac^{-1} \in V$ and so $v(a) \ge v(c)$. Clearly $v(a) \ge v(a)$ and if $v(a) \ge v(b)$ and
$v(b) \ge v(a)$, then $ab^{-1}, ba^{-1} \in V$ hence $ab^{-1} \in U$ and so $v(ab^{-1}) = 0$, hence
v(a) = v(b). Under this ordering $\Gamma$ becomes an ordered group, for if $v(a) \ge v(b)$,
then $ab^{-1} \in V$, hence for any $c \in K^\times$, $ac(bc)^{-1} = ab^{-1} \in V$, $ca(cb)^{-1} =
c \cdot ab^{-1} \cdot c^{-1} \in V$, hence $v(ac) \ge v(bc)$, $v(ca) \ge v(cb)$. This is a total ordering,
because V is a total subring. Moreover, since v is a homomorphism, we have
v(ab) = v(a) + v(b), and the fact that $ab^{-1} \in V \Rightarrow ab^{-1} + 1 \in V$ implies that
$v(a) \ge v(b) \Rightarrow v(a + b) \ge v(b)$. Thus v obeys the following rules:

V.1 $v(x) \in \Gamma$ for $x \in K^\times$,

V.2 $v(x + y) \ge \min\{v(x), v(y)\}$,

V.3 $v(xy) = v(x) + v(y)$.




Here we had to exclude the values x, y = 0, but it is more convenient to allow x = 0
and define $v(0) = \infty$. Then V.1–V.3 continue to hold if we define (as usual)
$\alpha + \infty = \infty + \alpha = \infty + \infty = \infty$, $\alpha < \infty$ for all $\alpha \in \Gamma$. As in the commutative
case we have equality in V.2 whenever $v(x) \ne v(y)$ ('all triangles are isosceles').

A function v from K to an ordered group $\Gamma$ (with $v(0) = \infty$), satisfying V.1–V.3
is called a valuation on K. Given such a valuation, we can define

\[ V = \{x \in K \mid v(x) \ge 0\}, \]

and it is easily verified that V is a valuation ring on K. In this way valuation rings on
K and valuations correspond to each other; to make the correspondence bijective we
define two valuations $v, v'$ on K with value groups $\Gamma, \Gamma'$ to be equivalent if there is an
order-preserving isomorphism $\varphi: \Gamma \to \Gamma'$ such that

\[ v(x)\varphi = v'(x) \quad \text{for all } x \in K^\times. \]


With this definition we have 


Theorem 9.4.1. On any field K there is a natural bijection between valuation rings and
equivalence classes of valuations on K.


Proof. This is an easy consequence of the above remarks and may be left to the
reader to prove. ∎


Valuations on skew fields were introduced by Otto F. G. Schilling in 1945. In the 
commutative case there is a third notion, equivalent to the above two, namely that 
of a place; this can also be defined for skew fields, but will not be needed here (see 
Exercise 2). 

We note that K itself is a valuation ring in K; it corresponds to the trivial valuation,
defined by v(x) = 0 for all $x \ne 0$, with trivial value group. Of course we shall
mainly be interested in non-trivial valuations.

The simplest (non-trivial) type of ordered group is the infinite cyclic group. It can 
be shown that a valuation has the infinite cyclic group as value group precisely when 
its valuation ring is a principal ideal domain; such a valuation is called principal. For 
example, the usual p-adic valuation on Q is principal, and in Chapter 9 of BA we saw 
that every valuation on a rational function field k(t) which is trivial on k is principal. 
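The rules V.1–V.3 can be checked numerically for the p-adic valuation on Q mentioned above. The following is a minimal sketch of our own; the function name `vp` is hypothetical, and the value $\infty$ assigned to 0 is modelled by `math.inf`.

```python
import math
from fractions import Fraction


def vp(x, p):
    """p-adic valuation of a rational number x, with vp(0) = infinity."""
    x = Fraction(x)
    if x == 0:
        return math.inf

    def ord_p(n):
        # Exponent of the prime p in the non-zero integer n.
        n, k = abs(n), 0
        while n % p == 0:
            n //= p
            k += 1
        return k

    return ord_p(x.numerator) - ord_p(x.denominator)


a, b = Fraction(9, 2), Fraction(3, 4)
# V.3: the valuation is additive on products.
product_rule = vp(a * b, 3) == vp(a, 3) + vp(b, 3)
# V.2 with equality, because vp(a, 3) = 2 and vp(b, 3) = 1 are different.
isosceles = vp(a + b, 3) == min(vp(a, 3), vp(b, 3))
# V.2 can be strict when the two values agree: v(3) = v(6) = 1 but v(9) = 2.
strict = vp(3 + 6, 3) > min(vp(3, 3), vp(6, 3))
```

All three checks hold. The value group here is Z, so this valuation is principal in the sense just defined.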

Let K be any field with a valuation v and write V for its valuation ring, $\mathfrak{m}$ for its
maximal ideal and U for the group of units in V. It is clear from V.2 that every
element of the form 1 + x, where $x \in \mathfrak{m}$, is a unit; such a unit is called a 1-unit
(Einseinheit). Thus u is a 1-unit whenever v(u − 1) > 0. The group of all 1-units
is written $1 + \mathfrak{m}$ or $U_1$. Let us denote $V/\mathfrak{m}$, the residue class field of V, by k. Then
we have a group isomorphism

\[ k^\times \cong U/U_1, \tag{9.4.1} \]

while the value group of v is given by

\[ \Gamma \cong K^\times/U. \tag{9.4.2} \]




These isomorphisms may be combined into the following commutative diagram 
with exact rows and columns: 


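\[
\begin{array}{ccccccccc}
 & & 1 & & 1 & & & & \\
 & & \downarrow & & \downarrow & & & & \\
1 & \to & U_1 & \to & U_1 & \to & 1 & & \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
1 & \to & U & \to & K^\times & \to & \Gamma & \to & 1 \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
1 & \to & k^\times & \to & K^\times/U_1 & \to & \Gamma & \to & 1 \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
 & & 1 & & 1 & & 1 & &
\end{array}
\tag{9.4.3}
\]


When V is principal, $\Gamma$ is infinite cyclic, so then the horizontal sequences split and

\[ K^\times/U_1 \cong k^\times \times \Gamma. \tag{9.4.4} \]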


For this reason the rows of (9.4.3) add nothing to our knowledge in the cases usually
encountered, but in general, especially with a non-abelian value group, (9.4.3)
provides more information about $K^\times$.

In constructing valuations it is helpful to know that any valuation on an Ore 
domain has a unique extension to its field of fractions. 


Proposition 9.4.2. Let R be a right Ore domain with field of fractions K. If v is a valuation
on R satisfying V.1–V.3, then v has a unique extension to K.


Proof. If v is to have an extension to K, then for $p = as^{-1} \in K$ we must have

\[ v(p) = v(a) - v(s). \tag{9.4.5} \]

Suppose that $as^{-1} = a_1s_1^{-1}$. Then there exist $u, u_1 \in R$ such that $su_1 = s_1u \ne 0$,
$au_1 = a_1u$; hence $a = 0 \Leftrightarrow a_1 = 0$, and when $a, a_1 \ne 0$, then $-v(a_1) + v(a) =
v(u) - v(u_1) = -v(s_1) + v(s)$ and so $v(a) - v(s) = v(a_1) - v(s_1)$. This shows the
definition (9.4.5) to be independent of the particular representation $as^{-1}$ of p.
Now it is easily verified that v so defined satisfies V.1–V.3 on K. ∎


Examples of valuations 


1. Let K be any field with a valuation v. We can extend v to the rational function
field K(x) by defining v on any polynomial $f = \sum x^i a_i$ ($a_i \in K$) by the rule

\[ v(f) = \min_i\{v(a_i)\}, \tag{9.4.6} \]

and using Proposition 9.4.2 to extend v to K(x). This is called the Gaussian
extension of v; the value group is unchanged while the residue class field undergoes
a simple transcendental extension. The same construction works if instead of
K(x) we use $K(x; \alpha)$, the skew function field with respect to an automorphism $\alpha$
of K, provided that $v(a^\alpha) = v(a)$ for all $a \in K$.

If we enlarge the value group $\Gamma$ by an element $\delta$ having no non-zero multiple in
$\Gamma$ (e.g. by forming the direct product of $\Gamma$ and the infinite cyclic group on $\delta$, with
the lexicographic ordering), and instead of (9.4.6) define

\[ v(f) = \min_i\{v(a_i) + i\delta\}, \tag{9.4.7} \]

we obtain an extension, called the x-adic extension (provided that $\delta > 0$), with the
same residue class field and enlarged group.

2. Consider the free field $k(\langle x, y\rangle)$; let E be the subfield generated over k by
$y_i = x^{-i}yx^i$ ($i \in \mathbf{Z}$). The conjugation by x defines an automorphism $\alpha$ which
maps E into itself, the 'shift automorphism' $y_i \mapsto y_{i+1}$. Moreover, $k(\langle x, y\rangle)$ may
be obtained as the skew function field $E(x; \alpha)$. Taking the x-adic extension of
the trivial valuation on E, we obtain a principal valuation on $k(\langle x, y\rangle)$ with residue
class field E. We note that whereas the general aim of valuation theory is to obtain
a simpler residue class field, E is actually more complicated than the original field.
As we shall see, in order to simplify the residue class field we must allow a more
complicated value group.

3. In a skew field it may happen that an element is conjugate to its inverse:
$y^{-1}xy = x^{-1}$. For example, let $\alpha$ be the automorphism of the rational function
field F = k(x) defined by $f(x) \mapsto f(x^{-1})$ and put $E = F(y; \alpha)$. If v is any valuation
on E, then v(x) = 0, for if $v(x) \ne 0$, say v(x) > 0, then $v(x^{-1}) < 0$, but $x^{-1} =
y^{-1}xy$, hence $v(x^{-1}) = -v(y) + v(x) + v(y) > 0$, a contradiction. In fact fields
exist in which every element outside the centre is conjugate to its inverse (e.g.
the existentially closed fields, see Cohn (1995)). It is clear that such a field can
have no non-trivial valuation.
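The Gaussian extension of Example 1 can be tried out in the simplest commutative case. The sketch below is our own illustration under stated assumptions: K = Q with the 3-adic valuation, polynomials stored as coefficient lists `[a_0, a_1, ...]`, and hypothetical helper names `vp`, `gauss_value` and `poly_mul`. It evaluates (9.4.6) and checks rule V.3 on one product.

```python
import math
from fractions import Fraction


def vp(x, p):
    """p-adic valuation on Q, with vp(0) = infinity."""
    x = Fraction(x)
    if x == 0:
        return math.inf
    num, den, k = abs(x.numerator), x.denominator, 0
    while num % p == 0:
        num //= p
        k += 1
    while den % p == 0:
        den //= p
        k -= 1
    return k


def gauss_value(f, p):
    """Gaussian extension (9.4.6): the least valuation among the coefficients of f."""
    return min(vp(a, p) for a in f)


def poly_mul(f, g):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    h = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] += Fraction(a) * Fraction(b)
    return h


f = [3, 6, 9]               # 3 + 6x + 9x^2, Gaussian value 1
g = [Fraction(1, 3), 1]     # 1/3 + x, Gaussian value -1
fg = poly_mul(f, g)         # 1 + 5x + 9x^2 + 9x^3, Gaussian value 0
```

Here `gauss_value(fg, 3)` equals `gauss_value(f, 3) + gauss_value(g, 3)`, as V.3 requires. A single product is of course only a sanity check, not a proof that (9.4.6) defines a valuation.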


One of the main tools of the commutative theory is Chevalley’s extension theorem 
(see BA, Section 9.5), which allows one to construct extensions for any valuations 
defined on a subfield. Such a result is not to be expected in general, but an analogue 
exists when the value group is abelian, and it is no harder to prove. 

A valuation on a field K is said to be abelian if its value group is abelian. For any
field K we denote the derived group of $K^\times$ by $K^c$. It is clear that any abelian valuation
on K is trivial on $K^c$. As an almost immediate consequence we have


Lemma 9.4.3. Let K be a skew field with a valuation v and valuation ring V. Then v is
abelian if and only if $V \supseteq K^c$ or equivalently, v(a) = 0 for all $a \in K^c$. Moreover, any
subring A of K such that $A \supseteq K^c$ is invariant, and any ideal in A is invariant.


Proof. The second sentence follows from the fact that v is abelian iff the unit group
of V contains $K^c$. To prove the last sentence, take any $a \in A^\times$, $b \in K^\times$; then
$b^{-1}ab = a \cdot a^{-1}b^{-1}ab \in A$, and similarly for any ideal of A. ∎


To state the analogue of Chevalley's theorem (Lemma 9.4.4) we require the notion
of domination. On any field K we consider the pairs $(R, \mathfrak{a})$ consisting of a subring R
of K and a proper ideal $\mathfrak{a}$ of R. Given two such pairs $P_i = (R_i, \mathfrak{a}_i)$ (i = 1, 2), we say
that $P_1$ dominates $P_2$, in symbols $P_1 \ge P_2$, if $R_1 \supseteq R_2$ and $\mathfrak{a}_1 \supseteq \mathfrak{a}_2$, and write $P_1 > P_2$
(as usual) for proper domination, i.e. to exclude equality. If the pair $(R, \mathfrak{a})$ is such
that $R \supseteq K^c$, then every element of $K^c$ is a unit in R (because $K^c$ is a group), and so
$K^c \cap \mathfrak{a} = \emptyset$. The essential step in our construction is the following


Lemma 9.4.4. Let K be a skew field, R a subring containing $K^c$ and $\mathfrak{a}$ a proper ideal in
R. Then there is a subring V with a proper ideal $\mathfrak{m}$ such that $(V, \mathfrak{m})$ is maximal among
pairs dominating $(R, \mathfrak{a})$, and any such maximal pair $(V, \mathfrak{m})$ consists of a valuation ring
and its maximal ideal.


Proof. This is quite similar to the commutative case (BA, Lemma 9.4.3); we briefly
recall it to show where changes are needed.

The pairs dominating $(R, \mathfrak{a})$ form an inductive family, so a maximal pair exists by
Zorn's lemma. If $(V, \mathfrak{m})$ is a maximal pair, then $\mathfrak{m}$ is a maximal ideal in V, and since
$V \supseteq K^c$, V and $\mathfrak{m}$ are invariant. To show that V is a total subring in K, take $c \in K$; if
$c \notin V$, then $V[c] \supset V$, so if the ideal $\mathfrak{m}'$ generated by $\mathfrak{m}$ in V[c] is proper, we have
$(V[c], \mathfrak{m}') > (V, \mathfrak{m})$, contradicting the maximality of the latter. Hence $\mathfrak{m}' = V[c]$
and we have an equation

\[ a_0 + a_1c + \ldots + a_mc^m = 1, \quad a_i \in \mathfrak{m}. \tag{9.4.8} \]

Here we were able to collect powers of c on the right of each term because of the
invariance of $\mathfrak{m}$, using the equation $cb = cbc^{-1} \cdot c$.
Similarly if $c^{-1} \notin V$, we have

\[ b_0 + b_1c^{-1} + \ldots + b_nc^{-n} = 1, \quad b_i \in \mathfrak{m}. \tag{9.4.9} \]

We assume that m, n are chosen as small as possible and suppose that $m \ge n$, say.
Multiplying (9.4.9) on the right by $c^m$, we obtain

\[ (1 - b_0)c^m = b_1c^{m-1} + \ldots + b_nc^{m-n}. \tag{9.4.10} \]

By the invariance of V, $xc = c \cdot x\gamma$ for all $x \in V$, where $\gamma = \gamma(c)$ is an automorphism
of V which maps $\mathfrak{m}$ into itself. If we multiply (9.4.8) by $1 - b_0$ on the left and (9.4.10)
by $a_m\gamma^m$ on the right and substitute into (9.4.8), we obtain an equation of the same
form as (9.4.8) but of degree less than m, a contradiction. This proves V to be total,
and hence a valuation ring; from the maximality it is clear that $\mathfrak{m}$ is the maximal
ideal of V. ∎


We now have the following form of the extension theorem: 


Theorem 9.4.5. Let $K \subseteq L$ be an extension of skew fields. Given an abelian valuation v
on K, there is an extension of v to L if and only if there is no equation

\[ \sum a_ic_i = 1, \quad \text{where } a_i \in K,\ v(a_i) > 0 \text{ and } c_i \in L^c. \tag{9.4.11} \]


Proof. If there is an equation (9.4.11), then any abelian extension w of v to L must
satisfy $w(a_ic_i) = w(a_i) = v(a_i) > 0$, hence $w(1) \ge \min\{w(a_ic_i)\} > 0$, a contradiction.
Conversely, if no equation (9.4.11) holds, this means that if V is the valuation ring of
v, with maximal ideal $\mathfrak{m}$, then $\mathfrak{m}L^c$ is a proper ideal in $VL^c$, and by Lemma 9.4.4 there
is a maximal pair $(W, \mathfrak{n})$ dominating $(VL^c, \mathfrak{m}L^c)$. Now W is a valuation ring satisfying
$W \cap K \supseteq V$, $\mathfrak{n} \cap K \supseteq \mathfrak{m}$, hence $W \cap K = V$ and so W defines the desired
extension. ∎


To make valuations more tractable we shall require an abelian value group and 
commutative residue class field. It is convenient to impose an even stronger con- 
dition, as in the next result: 


Proposition 9.4.6. Let K be a skew field with a valuation v, having valuation ring V,
maximal ideal $\mathfrak{m}$ and group of 1-units $U_1$. Then the following conditions are equivalent:

(a) $K^\times/U_1$ is abelian,
(b) $K^c \subseteq 1 + \mathfrak{m} = U_1$,
(c) $v(1 - a) > 0$ for all $a \in K^c$.

Moreover, when (a)–(c) hold, then the value group and residue class field are
commutative.


Proof. This is an almost immediate consequence of the definitions, and the last
sentence is clear from a glance at the diagram (9.4.3). ∎


A valuation satisfying the conditions of Proposition 9.4.6 will be called quasi-commutative.
The following condition for extensions of quasi-commutative valuations is
an easy consequence:


Theorem 9.4.7. Let K be a skew field with a quasi-commutative valuation v, and let L
be an extension field of K. Then v can be extended to a quasi-commutative valuation of
L if and only if there is no equation in L of the form

\[ \sum a_ip_i + \sum b_j(q_j - 1) = 1, \tag{9.4.12} \]

where $a_i, b_j \in K$, $v(a_i) > 0$, $v(b_j) \ge 0$, $p_i, q_j \in L^c$.


Proof. If there is a quasi-commutative extension w of v to L, then for $a_i, b_j, p_i, q_j$ as
above we have

\[ w\Bigl(\sum a_ip_i + \sum b_j(q_j - 1)\Bigr) \ge \min\{v(a_i) + w(p_i),\ v(b_j) + w(q_j - 1)\} > 0, \]

because $v(a_i) > 0$, $w(q_j - 1) > 0$. It follows that no equation of the form (9.4.12) can
hold. Conversely, assume that there is no equation (9.4.12) and consider the set $\mathfrak{q}$ of
all expressions $\sum a_ip_i + \sum b_j(q_j - 1)$, where $a_i, b_j, p_i, q_j$ are as before. It is clear that
$\mathfrak{q}$ is closed under addition and contains the maximal ideal corresponding to the
valuation v. Moreover, $\mathfrak{q}$ is invariant in L, i.e. $u^{-1}\mathfrak{q}u = \mathfrak{q}$ for all $u \in L^\times$, because
$u^{-1}a_ip_iu = a_i \cdot a_i^{-1}u^{-1}a_iu \cdot u^{-1}p_iu \in VL^c$, and similarly for the other terms. In the
same way we verify that $\mathfrak{q}$ admits multiplication. We now define

\[ T = \{c \in L \mid c\mathfrak{q} \subseteq \mathfrak{q}\}. \tag{9.4.13} \]

It is clear that T is a subring of L containing $L^c$, and it also contains the valuation
ring V of v, for if $c \in V$ and we multiply the expression on the left of (9.4.12) by
c, we obtain $\sum ca_ip_i + \sum cb_j(q_j - 1)$; this is of the same form, because $v(ca_i) =
v(c) + v(a_i) > 0$ and $v(cb_j) \ge 0$. Moreover, $\mathfrak{q}$ is an ideal in T, for we have $c\mathfrak{q} \subseteq \mathfrak{q}$
for all $c \in T$, by the definition of T, and $\mathfrak{q}c = c \cdot c^{-1}\mathfrak{q}c = c\mathfrak{q} \subseteq \mathfrak{q}$. Since $1 \notin \mathfrak{q}$ by
hypothesis, $\mathfrak{q}$ is a proper ideal in T. Thus T is a subring of L containing $L^c$ and the
valuation ring of v. By Lemma 9.4.4 we can find a maximal pair $(W, \mathfrak{p})$ dominating
$(T, \mathfrak{q})$ and W is a valuation ring such that $W \supseteq T \supseteq L^c$, while $1 + \mathfrak{p} \supseteq 1 + \mathfrak{q} \supseteq L^c$.
Hence the valuation w defined by W extends v and is quasi-commutative, by
Proposition 9.4.6. ∎


Let D be a skew field with centre C and let X be any set. If D has a quasi-commutative
valuation v, one can use Theorem 9.4.7 to extend v to a quasi-commutative
valuation of the free field $D_C(\langle X\rangle)$, but this requires more detail on how
free fields are formed. In essence one uses the specialization lemma to show that if
there is an equation (9.4.12), then X can be specialized to values in D so as to
yield an equation (9.4.12) in D, which is a contradiction (see Cohn [1987], [1989]).


Exercises 


1. Show that any total subring of a field is a local ring.

2. A place of a field K in another, L, is defined as a mapping $f : K \to L \cup \{\infty\}$ such
that $f^{-1}(L) = V$ is an invariant subring of K, the restriction $f|V$ is a homomorphism
and $xf = \infty$ implies $x \ne 0$ and $(x^{-1})f = 0$. Show that V is a valuation
ring and that conversely, every valuation ring on K leads to a place of K in the
residue class field of V. Define a notion of equivalence of places and show that
there is a natural bijection between valuation rings on K and equivalence classes
of places.

3. Verify that a valuation on a field K has value group Z iff its valuation ring is a
PID.

4. Let $F = D_C\langle X\rangle$ and denote by U the free field $D_C(\langle X\rangle)$. Form U(t) with a central
indeterminate t and define a homomorphism $\lambda: U \to U(t)$ as the identity on D
and mapping $x \in X$ to xt. Let $v_t$ be the t-adic valuation on U(t) over U (i.e. trivial
on U) and put $v(p) = v_t(p\lambda)$ for $p \in U$. Verify that v is a valuation on U; find its
value group and residue class field.

5. Let K be a field with a valuation v whose value group $\Gamma$ is a subgroup of R. Define
an extension of v to the rational function field K(x) by (9.4.7), where $\delta \in \mathbf{R}$, $\delta > 0$,
and find the new value group and residue class field. Distinguish the cases
$\Gamma \cap \delta\mathbf{Z} = \{0\}$, $\Gamma \cap \delta\mathbf{Z} \ne \{0\}$.

6. Let E be the field of fractions of the Weyl algebra on k (generated by u, v with
uv − vu = 1), where char k = 0. Writing $t = u^{-1}$, verify that vt = t(v + t). Show
that the t-adic valuation on E is quasi-commutative.




9.5 Pseudo-linear extensions 


We recall that (for any ring K) a K-ring A is essentially a ring A with a homomorphism
K → A. If K is a field, this means that A contains a copy of K except
when A = 0. Every K-ring A, for a field K, may be regarded as a left or right
K-module; we shall denote the corresponding dimensions by $[A : K]_L$ and $[A : K]_R$
and note that they need not be equal, even when A is itself a field. Below we shall give
an example of a field extension in which the right dimension is two and the left
dimension is infinite (Artin's problem, see Cohn [1961]). For examples in which
both dimensions are finite but different see Schofield [1985] or also Cohn (1995).

We note the product formula for dimensions, familiar from the commutative case. 
The proof is essentially the same as in that case (BA, Proposition 7.1.2) and so will 
not be repeated. 


Proposition 9.5.1. Let $D \subseteq E$ be skew fields and V a left E-space. Then

\[ [V : D]_L = [V : E]_L[E : D]_L, \]

whenever either side is finite. ∎


There are a number of cases where the left and right dimensions of a field extension
are equal. In the first case equality holds for an extension E/D if E is finite-dimensional
over its centre.


Proposition 9.5.2. Let E/D be a skew field extension and assume that E is finite-dimensional
over its centre. Then

\[ [E : D]_L = [E : D]_R, \tag{9.5.1} \]

whenever either side is finite.


Proof. By hypothesis E is finite-dimensional over its centre C. Write A = DC =
$\{\sum x_iy_i \mid x_i \in D,\ y_i \in C\}$; then A is a subring containing C in its centre, hence a
C-algebra and also a D-ring. Since it is generated by C over D, we may choose a basis
of A, as left D-space, consisting of elements of C. This is also a basis of A as right
D-space, because C is the centre of E, so

\[ [A : D]_L = [A : D]_R. \tag{9.5.2} \]

Now A is a subalgebra of the division algebra E, so A is also a skew field and by
Proposition 9.5.1,

\[ [E : C] = [E : A]_L[A : C] = [E : A]_R[A : C]. \tag{9.5.3} \]

Since [E : C] is finite, so is [A : C]; dividing (9.5.3) by [A : C], we find that
$[E : A]_L = [E : A]_R$. If we now multiply by (9.5.2) and use Proposition 9.5.1 and its
left-right dual, we obtain the required formula (9.5.1). ∎


We also have equality when D is commutative: 


Proposition 9.5.3. Let E be a skew field and D a commutative subfield. Then

\[ [E : D]_L = [E : D]_R, \tag{9.5.4} \]

whenever either side is finite.


Proof. Suppose that $[E : D]_L = n$, and denote the centre of E by C. Then $E^{\circ} \otimes_C E$
is simple, by Corollary 5.1.3, so if M(E) denotes the multiplication algebra of
E (generated by the left and right multiplications), then the natural mapping
$E^{\circ} \otimes E \to M(E)$ is injective. Now E is an n-dimensional left D-space, so
$[\mathrm{End}({}_DE) : D] = n^2$ and by restriction we obtain an injective mapping $D \otimes_C E \to
\mathrm{End}({}_DE)$. It follows that $[E : C] = [D \otimes_C E : D] \le n^2$, and now (9.5.4) is a consequence
of Proposition 9.5.2. ∎


By combining these results we obtain 


Theorem 9.5.4. If E/D is a skew field extension, then (9.5.4) holds whenever either side
is finite, provided that either (i) D is commutative or (ii) E or D is finite-dimensional
over its centre.


Proof. It only remains to treat the case where D is finite-dimensional over its centre.
Let Z be this centre, and denote the centre of E by C; further assume that $[E : D]_L$ is
finite, so

\[ [E : Z]_L = [E : D]_L[D : Z], \tag{9.5.5} \]

and this is also finite. Denote by K the subfield generated by C and Z; clearly K
is commutative and $[E : Z]_L = [E : K]_L[K : Z]_L$, so $[E : K]_L$ is finite, hence by
Proposition 9.5.3, $[E : K]_L = [E : K]_R$; now $[K : Z]_L = [K : Z]_R$ because K is
commutative, and by combining these equalities, we find that $[E : Z]_L = [E : Z]_R$.
Now (9.5.4) follows from this equation, combined with (9.5.5) and its right-hand
analogue. ∎


The study of finite-dimensional field extensions is greatly complicated by the lack
of commutativity. Let us consider the simplest case, of a quadratic extension E/D,
$[E : D]_R = 2$. For any $u \in E \setminus D$ the pair 1, u is a right D-basis of E. Thus every
element of E can be uniquely expressed in the form ua + b, where $a, b \in D$. In
particular, we have

\[ cu = uc^\alpha + c^\delta \quad \text{for all } c \in D, \tag{9.5.6} \]

and

\[ u^2 + u\lambda + \mu = 0 \quad \text{for certain } \lambda, \mu \in D. \tag{9.5.7} \]

Here $c^\alpha, c^\delta$ are uniquely determined by c and a calculation as in Section 7.3 shows $\alpha$
to be an endomorphism of D and $\delta$ an $\alpha$-derivation. Moreover, the structure of E is
completely determined by D and (9.5.6), (9.5.7).

Conversely, if D is any field with an endomorphism $\alpha$ and an $\alpha$-derivation $\delta$, then
for given $\lambda, \mu \in D$ it is possible to write down necessary and sufficient conditions for
a quadratic extension of D to be defined by (9.5.6) and (9.5.7) (see Exercise 3).
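A familiar instance of (9.5.6) and (9.5.7), added here as our own illustration rather than taken from the text, is the algebra of real quaternions $\mathbf{H}$ regarded as an extension of $D = \mathbf{C} = \mathbf{R}(i)$, with generator $u = j$:

```latex
\[
\mathbf{H} = \mathbf{C} \oplus j\mathbf{C}, \qquad [\mathbf{H} : \mathbf{C}]_R = 2 .
\]
For $c = a + bi$ with $a, b \in \mathbf{R}$ we have, using $ij = k$ and $ji = -k$,
\[
cj = aj + bk = j(a - bi) = j\bar{c},
\]
so (9.5.6) holds with $c^\alpha = \bar{c}$ (complex conjugation) and $c^\delta = 0$, while
\[
j^2 + 1 = 0
\]
is (9.5.7) with $\lambda = 0$, $\mu = 1$.
```

Here $\alpha$ is an automorphism of D, so the left and right dimensions agree (both are 2); the examples with different left and right dimensions need an endomorphism that is not surjective.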


Generalizing the above discussion, we may define a pseudo-linear extension of right
dimension n, with generator u, as an extension field E of D with right D-basis
$1, u, \ldots, u^{n-1}$ such that (9.5.6) holds, and in place of (9.5.7),

\[ u^n + u^{n-1}\lambda_1 + \ldots + \lambda_n = 0 \quad \text{for certain } \lambda_i \in D. \tag{9.5.8} \]

What has been said shows that every extension of right dimension 2 is pseudo-linear;
in higher dimensions the pseudo-linear extensions form a special class. We note the
following formula for the left dimension:


Proposition 9.5.5. Let E/D be a pseudo-linear extension of right dimension n, with
commutation formula (9.5.6). Then

\[ [E : D]_L = 1 + [D : D^\alpha]_L + [D : D^\alpha]_L^2 + \ldots + [D : D^\alpha]_L^{n-1}. \tag{9.5.9} \]

In particular, any pseudo-linear extension E/D satisfies

\[ [E : D]_L \ge [E : D]_R, \]

with equality if and only if $\alpha$ is an automorphism of D.


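Proof. Take a generator u of E/D and write $E_0 = D$, $E_i = uE_{i-1} + D$ ($i \ge 1$). Then by
induction on r we have

\[ E_r = D + uD + \ldots + u^rD. \]

Moreover, each $E_r$ is a left D-module, by (9.5.6), so we have a tower of left
D-modules

\[ D = E_0 \subset E_1 \subset E_2 \subset \ldots \subset E_{n-1} = E, \]

and (9.5.9) will follow if we prove

\[ [E_r/E_{r-1} : D]_L = [D : D^\alpha]_L^r. \tag{9.5.10} \]

Let $\{e_\lambda \mid \lambda \in I\}$ be a left $D^\alpha$-basis for D. We claim that the elements

\[ u^re_{\lambda_{r-1}}^{\alpha^{r-1}} \cdots e_{\lambda_1}^{\alpha}e_{\lambda_0}, \tag{9.5.11} \]

where $\lambda_0, \lambda_1, \ldots, \lambda_{r-1}$ range independently over I, form a basis of $E_r$ (mod $E_{r-1}$).
This will prove (9.5.10) and hence (9.5.9).

Any $c \in D$ can be written as a linear combination of the $e_\lambda$ with coefficients in $D^\alpha$,
say $c = \sum c_{\lambda_0}^\alpha e_{\lambda_0}$. If we repeat the process on $c_{\lambda_0}$, we obtain $c_{\lambda_0} = \sum c_{\lambda_0\lambda_1}^\alpha e_{\lambda_1}$. Hence

\[ c = \sum c_{\lambda_0\lambda_1}^{\alpha^2}e_{\lambda_1}^{\alpha}e_{\lambda_0}, \]

and after r steps we find

\[ c = \sum c_{\lambda_0\ldots\lambda_{r-1}}^{\alpha^r}e_{\lambda_{r-1}}^{\alpha^{r-1}} \cdots e_{\lambda_1}^{\alpha}e_{\lambda_0}. \]

Therefore

\[
u^rc = \sum u^rc_{\lambda_0\ldots\lambda_{r-1}}^{\alpha^r}e_{\lambda_{r-1}}^{\alpha^{r-1}} \cdots e_{\lambda_0}
\equiv \sum c_{\lambda_0\ldots\lambda_{r-1}}u^re_{\lambda_{r-1}}^{\alpha^{r-1}} \cdots e_{\lambda_0} \pmod{E_{r-1}}.
\]

Hence the elements (9.5.11) span $E_r$ (mod $E_{r-1}$). To prove their linear independence,
assume that

\[ \sum c_{\lambda_0\ldots\lambda_{r-1}}u^re_{\lambda_{r-1}}^{\alpha^{r-1}} \cdots e_{\lambda_0} \equiv 0 \pmod{E_{r-1}}. \]

Then

\[ u^r\sum c_{\lambda_0\ldots\lambda_{r-1}}^{\alpha^r}e_{\lambda_{r-1}}^{\alpha^{r-1}} \cdots e_{\lambda_0} \equiv 0 \pmod{E_{r-1}}. \]

Since the $e_\lambda$ are left linearly independent over $D^\alpha$, we can equate the coefficients of
$e_{\lambda_0}$ to 0 and using induction on r we find that $c_{\lambda_0\ldots\lambda_{r-1}} = 0$. ∎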


In order to obtain a quadratic extension satisfying (9.5.6) and (9.5.7) let us assume
that $\lambda = 0$, $\alpha\delta + \delta\alpha = 0$, and that $\delta^2$ is the inner $\alpha^2$-derivation induced by $-\mu$.
Further assume that $\mu^{\alpha} = \mu$, $\mu^{\delta} = 0$ and that $D$ contains no element $a$ satisfying

$$aa^{\alpha} + a^{\delta} + \mu = 0. \tag{9.5.12}$$

We form the skew polynomial ring $R = D[t; \alpha, \delta]$ and consider $f = t^2 + \mu$. For
any $c \in D$ we have

$$\begin{aligned}
cf = c(t^2 + \mu) &= (tc^{\alpha} + c^{\delta})t + c\mu \\
&= t^2c^{\alpha^2} + tc^{\alpha\delta} + tc^{\delta\alpha} + c^{\delta^2} + c\mu \\
&= t^2c^{\alpha^2} + \mu c^{\alpha^2} - c\mu + c\mu \\
&= fc^{\alpha^2}.
\end{aligned}$$

Hence $fR$ is a two-sided ideal. Moreover, $f$ is irreducible, for if $f$ could be factorized,
we would have a product of two linear factors which may both be taken monic, without
loss of generality. Then

$$t^2 + \mu = (t - a)(t - b) = t^2 - t(a^{\alpha} + b) + ab - a^{\delta}.$$

Therefore $b = -a^{\alpha}$ and $aa^{\alpha} + a^{\delta} + \mu = 0$, but this contradicts the fact that (9.5.12)
has no solution in $D$. It follows that $E = R/fR$ is an integral domain of right dimension
2 over $D$, hence a field. It has the right $D$-basis $1, u$, where $u$ is the residue class
of $t$, while the left dimension is $1 + [D:D^{\alpha}]_L$, and this will be greater than 2 provided
that $\alpha$ is not an automorphism.

As an example let us take any commutative field $k$, put $E = k(\langle x, y\rangle)$, the free field
on $x$ and $y$ over $k$, and take $D$ to be the subfield generated over $k$ by $x$, $y^2$ and $xy - yx^2$.
On $E$ we have a $k$-linear endomorphism $\alpha : x \mapsto x^2$, $y \mapsto -y$. To show this one has
either to verify directly that $\alpha$ as endomorphism of $k\langle x, y\rangle$ is honest, or show that $E$
admits an extension in which $x$ has a square root. For then, by iterating the process
one obtains a field $P$ containing $E$ in which $x$ has a $2^n$-th root for all $n \ge 0$, so the
mapping $x \mapsto x^2$, $y \mapsto -y$ is an automorphism of $P$ and the restriction to $E$ is
the required endomorphism. This is fairly plausible, but we shall not give a formal
proof here (see Cohn (1995), Section 5.9).

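It is easily checked that $\alpha$ maps $D$ into itself; moreover $D$ admits the inner
$\alpha$-derivation $\delta : a \mapsto ay - ya^{\alpha}$ induced by $y$. If we can show that $y \notin D$ and that $\alpha$ is not
an automorphism of $D$, we have a quadratic extension $E/D$ with left dimension $> 2$;
in fact we shall find that $[E:D]_L = \infty$.

Consider the $(\alpha, 1)$-derivation $\gamma$ on $E$ such that $x^{\gamma} = 0$, $y^{\gamma} = 1$. We have
$(ab)^{\gamma} = a^{\gamma}b + a^{\alpha}b^{\gamma}$, by definition, hence $x^{\gamma} = 0$, $(y^2)^{\gamma} = y + y^{\alpha} = 0$,
$(xy - yx^2)^{\gamma} = x^{\alpha} - x^2 = 0$. Hence $\gamma$ vanishes on $D$, but $y^{\gamma} = 1$, so $y \notin D$.

Finally to show that $[E:D]_L = \infty$, we first note that if $[D:D^{\alpha}]_L = n$, then
$[E:E^{\alpha}]_L$ is finite. For let $u_1, \dots, u_n$ be a left $D^{\alpha}$-basis of $D$. We claim that
$u_i$, $u_iu_j$, $u_iyu_j$ span $E$ as left $E^{\alpha}$-space. Any $p \in E$ has the form $p = a + yb$, where
$a, b \in D$, say $a = \sum a_i^{\alpha}u_i$, $b = \sum b_i^{\alpha}u_i$; hence, using $yb_i^{\alpha} = b_iy - b_i^{\delta}$,

$$p = \sum a_i^{\alpha}u_i + \sum yb_i^{\alpha}u_i = \sum a_i^{\alpha}u_i + \sum b_iyu_i - \sum b_i^{\delta}u_i
= \sum a_i^{\alpha}u_i + \sum b_{ij}^{\alpha}u_jyu_i - \sum c_{ij}^{\alpha}u_ju_i,$$

for suitable $b_{ij}, c_{ij} \in D$. This proves our claim; in particular, it shows that
$[E:E^{\alpha}]_L \le 2n^2 + n$, if $[D:D^{\alpha}]_L = n$.
Now $E^{\alpha} = k(\langle x^2, y\rangle)$ and so the elements $xy^r$ $(r = 1, 2, \dots)$ are left $E^{\alpha}$-linearly
independent. This is intuitively plausible and can be proved with the methods of Cohn
(1977), Lemma 5.5.5. It follows that $[E:E^{\alpha}]_L = \infty$, hence $[D:D^{\alpha}]_L = \infty$, and by
Proposition 9.5.5, $[E:D]_L = \infty$.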


Exercises 


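1. Show that every cyclic division algebra may be described as a pseudo-linear
   extension of a maximal commutative subfield.
2. Use the methods of this section to construct a field extension of right dimension $n$
   and infinite left dimension.
3. Show that (9.5.6) and (9.5.7) define an algebra of right dimension 2 over $D$ iff
   $c^{\alpha\delta} + c^{\delta\alpha} = \lambda c^{\alpha^2} - c^{\alpha}\lambda$, $c^{\delta^2} + c^{\delta}\lambda = \mu c^{\alpha^2} - c\mu$,
   $\lambda^{\delta} = \mu - \mu^{\alpha} - \lambda(\lambda - \lambda^{\alpha})$, $\mu^{\delta} = \mu(\lambda^{\alpha} - \lambda)$,
   and this extension is a field iff $cc^{\alpha} + c\lambda + c^{\delta} + \mu \ne 0$ for all $c \in D$.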


Further exercises on Chapter 9 


1. Let $D$ be a field with centre $C$ and $k$ a subfield of $C$ such that $C$ is algebraic over $k$.
   Show that if $D$ is finite-dimensional over $C$ and finitely generated as $k$-algebra,
   then $[D:k]$ is finite.

2. Show that over a local ring any matrix $A$ can be written as $A = LUP$, where $L$ is
   lower unitriangular, $U$ is upper triangular and $P$ is a permutation matrix.

3. Let $P = \begin{pmatrix} A & B \\ C & D \end{pmatrix}$ be a square matrix written in block form, where $A$ is invertible.
   Show that $\delta(P) = \delta(A)\,\delta(D - CA^{-1}B)$.

4. Show that the core of a matrix $A$ (over a field) has diagonal form iff each principal
   minor of $A$ is invertible, and when this is so, then the factors $L$ and $U$ in (9.2.8) (as
   well as $M$) are unique.

5. Show that in a skew field of characteristic not 2,
   $[(x + y - 2)^{-1} - (x + y + 2)^{-1}]^{-1} - [(x - y - 2)^{-1} - (x - y + 2)^{-1}]^{-1} = \tfrac{1}{2}(xy + yx)$.

6. Let $A$ be a full matrix over $k\langle X\rangle$. Show that there exists $n = n(A)$ such that for
   every central division $k$-algebra $D$ of degree at least $n$, $A$ is non-singular for
   some set of values of $X$ in $D$.

7. Let $K$ be a field with a valuation $v$. Given any matrix $A$ over $K$, show that
   $A = PLDUQ$, where $P$, $Q$ are signed permutation matrices, $L$ is lower and $U$
   upper unitriangular and $D = \mathrm{diag}(d_1, \dots, d_r)$ is a diagonal matrix with
   $v(d_1) \le v(d_2) \le \dots$. Show that if $A$ is square and $v$ is abelian, then $v(\delta(A)) = \sum v(d_i)$.

8. Show that in a quadratic extension $E/D$, subject to (9.5.6) and (9.5.7), $\alpha$ may be
   extended to an endomorphism $\alpha'$ of $E$ by $u^{\alpha'} = -u - \lambda$, and $\delta$ is then the inner
   $\alpha'$-derivation induced by $u$. Verify that $\alpha'$ is an automorphism of $E$ iff $\alpha$ is an
   automorphism of $D$.


Coding theory 


The theory of error-correcting codes deals with the design of codes which will detect, 
and if possible correct, any errors that occur in transmission. Codes should be dis- 
tinguished from cyphers, which form the subject of cryptography. The subject of 
codes dates from Claude Shannon’s classic paper on information theory (Shannon 
[1948]) and Section 10.1 provides a sketch of the background, leading up to the 
statement (but no proof) of Shannon’s theorem. Most of the codes dealt with are 
block codes which are described in Section 10.2, with a more detailed account of 
special cases in Sections 10.3—10.5; much of this is an application of the theory of 
finite fields (see BA, Section 7.8). 


10.1 The transmission of information 


Coding theory is concerned with the transmission of information. For example, 
when a spacecraft wishes to send pictures back to Earth, these pictures are converted 
into electrical impulses, essentially representing strings of 0’s and 1’s, which are 
transmitted back to Earth, but the message sent may be distorted by ‘noise’ in 
space, and one has to build in some redundancy to overcome these errors (provided 
they are not too numerous). An important everyday application is to digital recording
and transmission, for example the compact disc, on which a piece of music is
stored by means of tiny pits representing 0’s and 1’s according to a code.

To transmit our messages we need an alphabet $Q$ consisting of $q$ symbols, where
$q \ge 2$; we also speak of a $q$-ary code. A finite string of letters from $Q$ is called a word.
The information to be transmitted is encoded by words of Q; during transmission 
the coded message may be slightly changed (due to a ‘noisy’ channel) and, as a 
result, a slightly different message is received. However, if the code is appropriately 
chosen, it is nevertheless possible to decode the result so as to recover the original 
message. 


message --(coding)--> coded message --(transmission)--> received message --(decoding)--> decoded message




Many of our codes are binary: $q = 2$. For example, in the game of ‘twenty questions’
an object has to be guessed by asking 20 questions which can be answered ‘yes’
or ‘no’. This allows one to pick out one object in a million (since $2^{20} \approx 10^6$). Usually
a binary code will have the alphabet $\{0, 1\}$; our coded message will then be a string
of 0’s and 1’s. As a simple check we can add 1 when the number of 1’s in the message 
is odd and 0 when it is even. If the received message contains seven 1’s we know that 
a mistake has occurred and we can ask for the message to be repeated (if this is 
possible). This is a parity check; it will show us when an odd number of errors 
occurs, but it does not enable us to correct errors, as is possible by means of 
more elaborate checks. Before describing ways of doing this we shall briefly discuss 
the question of information content, although strictly speaking this falls outside our 
topic. The rest of this section will not be used in the sequel and can be omitted with- 
out loss of continuity. 
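The parity check just described can be sketched in a few lines of Python. This is only an illustration; the function names `add_parity_bit` and `parity_ok` are our own and do not come from the text.

```python
def add_parity_bit(bits):
    """Append 1 if the number of 1's in the message is odd, 0 if it is even."""
    return bits + [sum(bits) % 2]


def parity_ok(word):
    """A received word passes the check when it contains an even number of 1's."""
    return sum(word) % 2 == 0


message = [1, 0, 1, 1]
sent = add_parity_bit(message)

# Flip one digit, then a second one, to imitate a noisy channel.
one_error = sent.copy()
one_error[2] ^= 1
two_errors = one_error.copy()
two_errors[0] ^= 1

print(sent, parity_ok(sent))
print(one_error, parity_ok(one_error))
print(two_errors, parity_ok(two_errors))
```

A single error is noticed, whereas two errors restore even parity and pass unnoticed, in line with the remark that only an odd number of errors is detected.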

It is intuitively clear that the probability of error can be made arbitrarily small by 
adding sufficiently many checks to our message, and one might think that this will 
make the transmission rate also quite small. However, a remarkable theorem due to 
Shannon asserts that every transmission channel has a capacity C, usually a positive 
number, and for any transmission rate less than C the probability of error can be 
made arbitrarily small. Let us briefly explain these terms. 

The information content of a message is determined by the likelihood of the event
it describes. Thus a message describing a highly probable event (e.g. ‘the cat is on the
mat’) has a low information content, while for an unlikely message (‘the cow jumped
over the moon’) the information content is large. If the probability of the event
described by the message is $p$, where $0 < p \le 1$, we shall assign as a measure of
information $-\log_2 p$. Here the minus sign is included to make the information
positive and the logarithm is chosen to ensure that the resulting function is additive.
If independent messages occur with probabilities $p_1$, $p_2$ then the probability that both
occur is $p_1p_2$ and here the information content is

$$-\log p_1p_2 = -\log p_1 - \log p_2.$$

All logs are taken to the base 2 and the unit of information is the bit (binary digit).
Thus if we use a binary code and 0, 1 are equally likely, then each digit carries the
information $-\log(1/2) = 1$, i.e. one bit of information.

Suppose we have a channel transmitting our binary code in which a given message,
consisting of blocks of $k$ bits, is encoded into blocks of $n$ bits; the information rate of
this system is defined as


R=k/n. (10.1.1) 


We assume further that the probability of error, $x$ say, is the same for each digit; this
is the binary symmetric channel. When an error occurs, the amount of information
lost is $-\log x$, so on average the information lost per digit is $-x\log x$. But when no
error occurs, there is also some loss of information (because we do not know that no
error occurred); this is $-\log(1 - x)$. The total amount of information lost per digit
is therefore

$$H(x) = -x\log x - (1 - x)\log(1 - x).$$

This is also called the entropy, e.g. $H(0.1) = 0.469$, $H(0.01) = 0.0808$. The channel
capacity, in bits per digit, is the amount of information passed, i.e.

$$C(x) = 1 - H(x) = 1 + x\log x + (1 - x)\log(1 - x).$$

Thus $C(0.1) = 0.531$, $C(0.01) = 0.9192$. We note that $C(0) = 1$; this means that for
$x = 0$ there is no loss of information. By contrast, $C(1/2) = 0$; thus when there is an
even chance of error, no information can be sent. The fundamental theorem of
coding theory, proved by Shannon in 1948, states that for any $\delta, \varepsilon > 0$, there exist
codes with information rate $R$ greater than $C(x) - \varepsilon$, for which the probability of
error is less than $\delta$. In other words, information flows through the channel at
nearly the rate $C(x)$ with a probability of error that can be made arbitrarily small.
Here the information rate of the code is represented by (10.1.1). More generally,
if there are $M$ different code words, all of length $n$, then $R = (\log M)/n$. For a
binary code there are $2^k$ different words of length $k$, so $\log M = k$ and the rate
reduces to $k/n$.
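The numerical values quoted above are easy to reproduce. The following Python sketch evaluates the entropy $H(x)$, the capacity $C(x) = 1 - H(x)$ and the rate $R = (\log M)/n$; the function names are ours, and the convention $0 \log 0 = 0$ at the endpoints is an assumption made so that $C(0) = 1$ comes out as stated.

```python
import math


def entropy(x):
    """H(x) = -x log x - (1 - x) log(1 - x), logs to base 2, with 0 log 0 taken as 0."""
    if x in (0.0, 1.0):
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)


def capacity(x):
    """Capacity of the binary symmetric channel with error probability x."""
    return 1.0 - entropy(x)


def information_rate(M, n):
    """R = (log M)/n for a binary code with M code words of length n."""
    return math.log2(M) / n


for x in (0.1, 0.01, 0.5):
    print(x, round(entropy(x), 4), round(capacity(x), 4))
print(information_rate(2 ** 4, 7))  # k/n with k = 4, n = 7
```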


Exercises 


1. How many questions need to be asked to determine one object in 10” if the reply 
is one of three alternatives? 

2. In the binary symmetric channel with probability of error 1, no information is
lost, i.e. $C(1) = 1$. How is this to be interpreted?

3. If $n$ symbols are transmitted and the probability of error in each of them is $x$,
show that the probability of exactly $k$ errors is $\binom{n}{k} x^k (1 - x)^{n-k}$.


10.2 Block codes 


Most of our codes in this chapter will be block codes, that is codes in which all code
words have the same number of letters. This number, $n$ say, is called the length of the
code. Thus a block code of length $n$ in an alphabet $Q$ may be thought of as a
sequence of words chosen from $Q^n$. For any $x, y \in Q^n$ we define the distance (also
called the Hamming distance) between $x$ and $y$, written $d(x, y)$, as the number of
positions in which $x$ and $y$ differ, e.g. $d(\text{pea}, \text{pod}) = 2$. We note that this function
satisfies the usual axioms of a metric space:

M.1 $d(x, y) \ge 0$ with equality iff $x = y$,
M.2 $d(x, y) = d(y, x)$,
M.3 $d(x, y) + d(y, z) \ge d(x, z)$ (triangle inequality).

M.1-M.2 are clear and M.3 follows because if $y$ differs in $r$ places from $x$ and in $s$
places from $z$, then $x$ and $z$ can differ in at most $r + s$ places.
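As a small illustration, the Hamming distance can be computed as follows; the function name `hamming_distance` is our own choice, and words are represented as Python strings or sequences of equal length.

```python
from itertools import product


def hamming_distance(x, y):
    """Number of positions in which the words x and y differ."""
    if len(x) != len(y):
        raise ValueError("words must have the same length")
    return sum(a != b for a, b in zip(x, y))


print(hamming_distance("pea", "pod"))

# Check the triangle inequality M.3 on all binary words of length 3.
words = list(product("01", repeat=3))
triangle_holds = all(
    hamming_distance(x, y) + hamming_distance(y, z) >= hamming_distance(x, z)
    for x in words for y in words for z in words
)
print(triangle_holds)
```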

Let $C$ be a block code which is $q$-ary of length $n$. The least distance $d = d(C)$
between code words of $C$ is called the minimum distance of $C$. If the number of
code words in $C$ is $M$, then $C$ is called a $q$-ary $(n, M, d)$-code, or an $(n, *, d)$-code
if we do not wish to specify $M$. For successful decoding we have to ensure that the
code words are not too close together; if $d(x, y)$ is large, this means that $x$ and $y$ differ
in many of the $n$ places, and $x$ is unlikely to suffer so many changes in transmission
that $y$ is received. Thus our aim will be to find codes for which $d$ is large. Our first
result tells us how a large value of $d$ allows us to detect and correct errors. A code is
said to be $r$-error-detecting (correcting) if for any word differing from a code word $u$
in at most $r$ places we can tell that an error has occurred (resp. find the correct code
word $u$).

We shall define the $r$-sphere about a word $x \in Q^n$ as the sphere of radius $r$ with
centre $x$:

$$B_r(x) = \{y \in Q^n \mid d(x, y) \le r\}. \tag{10.2.1}$$

Clearly it represents the set of all words differing from $x$ in at most $r$ places.


Proposition 10.2.1. A code with minimum distance $d$ can (i) detect up to $d - 1$ errors,
and (ii) correct up to $[(d - 1)/2]$ errors.
Here $[\xi]$ denotes the greatest integer $\le \xi$.


Proof. (i) If $x$ is a code word and $s$ errors occur in transmission, then the received
word $x'$ will be such that $d(x, x') = s$. Hence if $0 < s < d$, $x'$ cannot be a code
word and it will be noticed that an error has occurred.

(ii) If $e \le [(d - 1)/2]$, then $2e + 1 \le d$ and it follows that the $e$-spheres about the
different code words are disjoint. For if $x$, $y$ are distinct code words and $u \in B_e(x) \cap B_e(y)$,
then

$$2e \ge d(x, u) + d(u, y) \ge d(x, y) \ge d \ge 2e + 1,$$

which is a contradiction. Thus for any word differing from a code word $x$ in at most
$e$ places there is a unique nearest code word, namely $x$. ∎


For example, the parity check mentioned in Section 10.1 has minimum distance 2 
and it will detect single errors, but will not correct errors. 
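The following Python sketch computes the minimum distance of a small code by brute force and reads off the detection and correction capabilities given by Proposition 10.2.1. The helper names are ours, and the two example codes (the even-weight code of length 3 and a repetition code of length 5) are chosen purely for illustration.

```python
from itertools import combinations


def hamming_distance(x, y):
    """Number of positions in which x and y differ."""
    return sum(a != b for a, b in zip(x, y))


def minimum_distance(code):
    """Least distance between two distinct code words (brute force over all pairs)."""
    return min(hamming_distance(x, y) for x, y in combinations(code, 2))


def errors_detected(d):
    """Proposition 10.2.1 (i): up to d - 1 errors are detected."""
    return d - 1


def errors_corrected(d):
    """Proposition 10.2.1 (ii): up to [(d - 1)/2] errors are corrected."""
    return (d - 1) // 2


parity_code = ["000", "011", "101", "110"]   # two message bits plus a parity bit
repetition_code = ["00000", "11111"]

for code in (parity_code, repetition_code):
    d = minimum_distance(code)
    print(code, d, errors_detected(d), errors_corrected(d))
```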

Proposition 10.2.1 puts limits on the number of code words in an error-correcting
code. Given $n$, $d$, we denote by $A(n, d)$ or $A_q(n, d)$ the largest number of code words
in a $q$-ary code of length $n$ and minimum distance $d$; thus $A_q(n, d)$ is the largest $M$
for which a $q$-ary $(n, M, d)$-code exists. A code for which this maximum is attained
is also called an optimal code; any optimal $(n, M, d)$-code is necessarily maximal, i.e.
it is not contained in an $(n, M + 1, d)$-code.

To obtain estimates for $A_q(n, d)$ we need a formula for the number of elements in
$B_r(x)$. This number depends on $q$, $n$, $r$ but not on $x$; it is usually denoted by $V_q(n, r)$.
To find its value let us count the number of words at distance $i$ from $x$. These
words differ from $x$ in $i$ places, and the values at these places can be any one of
$q - 1$ letters, so there are $\binom{n}{i}(q - 1)^i$ ways of forming such words. If we do this for
$i = 0, 1, \dots, r$ and add the results, we obtain

$$V(n, r) = V_q(n, r) = 1 + \binom{n}{1}(q - 1) + \binom{n}{2}(q - 1)^2 + \dots + \binom{n}{r}(q - 1)^r. \tag{10.2.2}$$

A set of spheres is said to cover $Q^n$ or form a covering if every point of $Q^n$ lies in at least
one sphere. It is called a packing if every point of $Q^n$ lies in at most one sphere, i.e. the
spheres are non-overlapping. We note that for $0 \le r \le n$,

$$q^r = V_q(r, r) \le V_q(n, r) \le V_q(n, n) = q^n.$$


Theorem 10.2.2. Given integers $q \ge 2$, $n$, $d$, put $e = [(d - 1)/2]$. Then the number
$A_q(n, d)$ of code words in an optimal $(n, *, d)$-code satisfies

$$\frac{q^n}{V_q(n, d - 1)} \le A_q(n, d) \le \frac{q^n}{V_q(n, e)}. \tag{10.2.3}$$


Proof. Let $C$ be an optimal $(n, M, d)$-code; then $C$ is maximal, and it follows that no
word in $Q^n$ has distance $\ge d$ from all the words of $C$, for such a word would allow us
to enlarge the code, and so increase $M$. Hence every word of $Q^n$ is within distance at
most $d - 1$ of some word of $C$; thus the $(d - 1)$-spheres about the code words as
centres cover $Q^n$ and so $M \cdot V(n, d - 1) \ge q^n$, which gives the first inequality in
(10.2.3).

On the other hand, we have $2e + 1 \le d$, so the spheres $B_e(x)$, as $x$ runs over an
$(n, M, d)$-code, are disjoint and hence form a packing of $Q^n$. Therefore
$M \cdot V(n, e) \le q^n$, and the second inequality in (10.2.3) follows. ∎


The above proof actually shows that a code with $q^n/V(n, d - 1)$ code words and
minimum distance $d$ can always be constructed; we shall not carry out the construction
yet, since we shall see in Section 10.3 that it can always be realized by a linear
code.

The first inequality in (10.2.3) is called the Gilbert-Varshamov bound, and the
second is the sphere-packing or Hamming bound. A code is said to be perfect if,
for some $e \ge 1$, the $e$-spheres with centres at the code words form both a packing
and a covering of $Q^n$. Such a code is an $(n, M, 2e + 1)$-code for which
$M \cdot V(n, e) = q^n$, so it is certainly optimal. It is characterized by the property that
every word of $Q^n$ is nearer to one code word than to any of the others. To give
an example, any code consisting of a single code word, or of the whole of $Q^n$ is perfect.
For $q = 2$ and odd $n$ (with alphabet $\{0, 1\}$) the binary repetition code $\{0^n, 1^n\}$ is
also perfect. These are the trivial examples; we shall soon meet non-trivial ones.
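The quantities in (10.2.2) and (10.2.3) can be evaluated directly. The sketch below uses function names of our own choosing; since $A_q(n, d)$ is an integer, the lower bound is rounded up and the upper bound rounded down.

```python
from math import comb


def sphere_volume(q, n, r):
    """V_q(n, r): number of words of Q^n within distance r of a fixed word, as in (10.2.2)."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(r + 1))


def gilbert_varshamov_lower(q, n, d):
    """Smallest integer >= q^n / V_q(n, d - 1), the first inequality in (10.2.3)."""
    return -(-q ** n // sphere_volume(q, n, d - 1))


def hamming_upper(q, n, d):
    """Largest integer <= q^n / V_q(n, e) with e = [(d - 1)/2], the second inequality."""
    e = (d - 1) // 2
    return q ** n // sphere_volume(q, n, e)


def is_perfect(q, n, M, e):
    """A code with M words is perfect for radius e when M * V_q(n, e) = q^n."""
    return M * sphere_volume(q, n, e) == q ** n


print(gilbert_varshamov_lower(2, 7, 3), hamming_upper(2, 7, 3))
print(is_perfect(2, 5, 2, 2))   # binary repetition code of length 5, e = 2
```

For $q = 2$, $n = 7$, $d = 3$ the upper bound is 16, which is the number of code words in the binary $[7, 4]$ Hamming code.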

There are several ways of modifying a code to produce others, possibly with better
properties. Methods of extending codes will be discussed in Section 10.3, when we
come to linear codes. For the moment we observe that from any $(n, M, d)$-code $C$
we obtain an $(n - 1, M, d')$-code, where $d' = d$ or $d - 1$, by deleting the last
symbol of each word. This is called puncturing the code $C$. If we consider all
words of $C$ ending in a given symbol and take this set of words with the last
symbol omitted, we obtain an $(n - 1, M', d')$-code, where $M' \le M$ and $d' \ge d$. This
is called shortening the code $C$. Of course we can also puncture or shorten a given
code by operating on any position other than the last one.


Exercises 


1. Show that $A_q(n, 1) = q^n$, $A_q(n, n) = q$.
2. Use the proof of Theorem 10.2.2 to construct an $(n, q^n/V_q(n, d - 1), d)$-code, for
any $q \ge 2$, $n$ and $d$.
3. Show that there is a binary (8, 4, 5)-code, and that this is optimal.
4. (The Singleton bound) Prove that $A_q(n, d) \le q^{n-d+1}$. (Hint. Take an optimal
$(n, M, d)$-code and puncture it repeatedly; for a linear $[n, k]$-code this gives
$k \le n - d + 1$.)


10.3 Linear codes 


By a linear code one understands a code with a finite field $F_q$ as alphabet, such that
the set of code words forms a subspace. A linear code which is a $k$-dimensional
subspace of an $n$-dimensional space over $F_q$ will be called an $[n, k]$-code over $F_q$.
Thus any $[n, k]$-code over $F_q$ is a linear $(n, q^k, d)$-code for some $d$. The number
of non-zero entries of a vector $x$ is called its weight $w(x)$; hence for a linear code
we can write

$$d(x, y) = w(x - y). \tag{10.3.1}$$

A linear $(n, M, d)$-code has the advantage that to find $d$ we need not check all the
$M(M - 1)/2$ distances between code words but only the $M$ weights. Moreover, to
describe an $[n, k]$-code $C$ we need not list all $q^k$ code words but just give a basis
of the subspace defining $C$. The $k \times n$ matrix whose rows form the basis vectors is
called a generator matrix of $C$; clearly its rank is $k$.

A matrix over a field is called left full if its rows are linearly independent, right full 
if its columns are linearly independent. For a square matrix these concepts are of 
course equivalent and we then speak of a full matrix. Thus a generator matrix of a 
linear code is left full, and any left full $k \times n$ matrix over $F_q$ forms a generator
matrix of an $[n, k]$-code.

Let $C$ be an $[n, k]$-code with generator matrix $G$. Any code word is a linear
combination of the rows of $G$, since these rows form a basis for the code. Thus to encode
a message $u \in F_q^k$ we multiply it by $G$:

$$u \mapsto uG.$$

For example, for the simple parity check code with generator matrix
$\begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix}$ this takes the form

$$(u_1, u_2) \mapsto (u_1, u_2, u_1 + u_2).$$
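Encoding by a generator matrix is a vector-matrix product over $F_q$. The sketch below assumes that $q$ is prime, so that arithmetic in $F_q$ is just arithmetic modulo $q$ (for prime powers a proper finite-field implementation would be needed); the function name `encode` is our own.

```python
def encode(u, G, q=2):
    """Return the code word uG, computed modulo a prime q."""
    k, n = len(G), len(G[0])
    if len(u) != k:
        raise ValueError("message length must equal the number of rows of G")
    return [sum(u[i] * G[i][j] for i in range(k)) % q for j in range(n)]


# Generator matrix of the simple parity check code of length 3.
G = [[1, 0, 1],
     [0, 1, 1]]

for u in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(u, encode(u, G))
```

Each message $(u_1, u_2)$ is sent to $(u_1, u_2, u_1 + u_2)$, as in the displayed formula.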




For any $[n, k]$-code $C$ we define the dual code $C^{\perp}$ as

$$C^{\perp} = \{y \in F_q^n \mid xy^T = 0 \text{ for all } x \in C\}.$$

Since any $x \in C$ has the form $x = uG$ $(u \in F_q^k)$, the vectors $y$ of $C^{\perp}$ are obtained as
the solutions of the system $Gy^T = 0$. Here $G$ has rank $k$, therefore $C^{\perp}$ is an $(n - k)$-dimensional
subspace of $F_q^n$; thus $C^{\perp}$ is an $[n, n - k]$-code. It is clear from the
definition that $C^{\perp\perp} = C$. When $n$ is even, it may happen that $C^{\perp} = C$; in that
case $C$ is said to be self-dual. For example, the binary code with generator
matrix $\begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 \end{pmatrix}$ is self-dual.

Generally, if $G$, $H$ are generator matrices for codes $C$, $C^{\perp}$ that are mutually dual,
we have

$$GH^T = 0; \tag{10.3.2}$$

since $G$, $H$ are both left full, it follows that the sum of their ranks is $n$ and by the
theory of linear equations (see e.g. Cohn (1994), Chapter 4),

$$x = uG \text{ for some } u \in F_q^k \iff xH^T = 0, \tag{10.3.3}$$
$$y = vH \text{ for some } v \in F_q^{n-k} \iff yG^T = 0. \tag{10.3.4}$$


A generator matrix $H$ for $C^{\perp}$ is called a parity check matrix for $C$. For example,
when $G = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix}$, then $H = (1 \ 1 \ 1)$. For any code word $x$ we form
$xH^T = x_1 + x_2 + x_3$; if this is non-zero, an error has occurred. Our code detects one
error, but no more (as we have already seen). Before introducing more elaborate
codes we shall describe a normal form for the generator and parity check matrices.

Two block codes of length $n$ are said to be equivalent if one can be obtained from
the other by permuting the $n$ places of the code symbols and (in the case of a linear
code) multiplying the symbols in a given place by a non-zero scalar. For a generator
matrix of a linear code these operations amount to (i) permuting the columns and
(ii) multiplying a column by a non-zero scalar. We can of course also change the
basis and this will not affect the code. This amounts to performing elementary operations
on the rows of the generator matrix (and so may affect the encoding rules). We
recall that any matrix over a field may be reduced to the form

$$\begin{pmatrix} I & C \\ 0 & 0 \end{pmatrix} \tag{10.3.5}$$

by elementary row operations and column permutations (see Cohn (1994), p. 59),
and for a left full matrix the zero rows are of course absent. Hence we obtain




Theorem 10.3.1. Any $[n, k]$-code is equivalent to a code with generator matrix of the
form

$$G = (I_k \ \ P), \tag{10.3.6}$$

where $P$ is a $k \times (n - k)$ matrix. ∎


It should be emphasized that whereas the row operations change merely the basis, 
the column operations may change the code (to an equivalent code). More precisely, 
an $[n, k]$-code has a generator matrix of the form (10.3.6) iff the first $k$ columns of its
generator matrix are linearly independent. This condition is satisfied in most practical
cases. If we use a generator matrix $G$ in the standard form (10.3.6) for encoding,
the code word $uG$ will consist of the message symbols $u_1, \dots, u_k$ followed by $n - k$
check symbols.

The standard form (10.3.6) for the generator matrix of $C$ makes it easy to write
down the parity check matrix; its standard form is

$$H = (-P^T \ \ I_{n-k}). \tag{10.3.7}$$

For we have $GH^T = -P + P = 0$, and since $H$ is left full, $H^T$ is right full, thus of rank
$n - k$, and it follows that the rows of $H$ form a basis for the dual code $C^{\perp}$.
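The passage from (10.3.6) to (10.3.7) is mechanical and can be sketched as follows. As before we assume that $q$ is prime, so that $F_q$ is the integers modulo $q$; the function names are hypothetical, and the input is assumed to be already in the standard form $(I_k \ P)$.

```python
def parity_check_from_standard_form(G, q=2):
    """Given G = (I_k  P) over F_q (q prime), return H = (-P^T  I_{n-k})."""
    k, n = len(G), len(G[0])
    P = [row[k:] for row in G]
    H = []
    for i in range(n - k):
        minus_p_column = [(-P[j][i]) % q for j in range(k)]
        identity_row = [1 if t == i else 0 for t in range(n - k)]
        H.append(minus_p_column + identity_row)
    return H


def is_orthogonal(G, H, q=2):
    """Check that G H^T = 0 over F_q."""
    return all(sum(a * b for a, b in zip(g, h)) % q == 0 for g in G for h in H)


G = [[1, 0, 1, 1],
     [0, 1, 0, 1]]
H = parity_check_from_standard_form(G)
print(H, is_orthogonal(G, H))

G3 = [[1, 0, 1],
      [0, 1, 2]]          # a ternary example
H3 = parity_check_from_standard_form(G3, q=3)
print(H3, is_orthogonal(G3, H3, q=3))
```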

The process of decoding just consists in finding the code word nearest to the
received word. Let us see how the parity check matrix may be used here. We have
an $[n, k]$-code $C$ with parity check matrix $H$. For any vector $x \in F_q^n$ the vector
$xH^T \in F_q^{n-k}$ is called the syndrome of $x$. By (10.3.3), the syndrome of $x$ is 0 precisely
when $x \in C$. More generally, two vectors $x, x' \in F_q^n$ are in the same coset of $C$ iff
$xH^T = x'H^T$. Thus the syndrome determines the coset: if a vector $x$ in $C$ is transmitted
and the received word $y$ is $x + e$, then $y$ and $e$ have the same syndrome.
To ‘decode’ $y$, i.e. to find the nearest code word, we choose a vector $f$ of minimum
weight in the coset of $C$ containing $y$ and then replace $y$ by $y - f$. Such a vector $f$ of
minimum weight in its coset need not be unique; we choose one such $f$ in each coset
and call it the coset leader. The process described above is called syndrome decoding.

For example consider a binary [4, 3]-code with generator matrix

$$G = \begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \end{pmatrix}.$$

To encode a vector we have

$$(u_1, u_2, u_3) \mapsto (u_1, u_2, u_3, u_1 + u_2 + u_3).$$

The parity check matrix is $H = (1 \ 1 \ 1 \ 1)$. The possible syndromes are 0 and 1.
We arrange the 16 vectors of $F_2^4$ as a $2 \times 8$ array with cosets as rows, headed by
the syndrome and coset leaders:

0   0000  1001  0101  0011  1100  1010  0110  1111

1   1000  0001  1101  1011  0100  0010  1110  0111




This is called a standard array. To decode $x$ we form its syndrome $x_1 + x_2 + x_3 + x_4$.
If this is 0, we can take the first three coordinates as our answer. If it is 1, we subtract
the coset leader 1000 before taking the first three coordinates. We note that in this
case there are four possible coset leaders for the syndrome 1; this suggests that the
code is not very effective; in fact $d = 2$, so the code is 1-error-detecting.

Next take the [4, 2]-code with generator and parity check matrices

$$G = \begin{pmatrix} 1 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 \end{pmatrix}, \qquad H = \begin{pmatrix} 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 1 \end{pmatrix}.$$

A standard array is

00   0000  1011  0101  1110
01   0100  1111  0001  1010
10   0010  1001  0111  1100
11   1000  0011  1101  0110

To decode $x = (1101)$ we form $xH^T = (11)$ and then subtract from $x$ the coset
leader for the syndrome (11), giving $(1101) - (1000) = (0101)$. The minimum
distance is 2, so the code again detects single errors.
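Syndrome decoding for this [4, 2]-code can be sketched in Python as follows. The helper names are ours. Since coset leaders need not be unique, a tie-breaking rule has to be assumed: among vectors of equal weight the code below prefers the one whose 1's occur earliest, which happens to reproduce the leaders 0100, 0010, 1000 of the standard array above.

```python
from itertools import product

H = [[1, 0, 1, 0],
     [1, 1, 0, 1]]


def syndrome(x, H):
    """The vector x H^T over F_2."""
    return tuple(sum(a * b for a, b in zip(x, row)) % 2 for row in H)


def coset_leaders(H, n):
    """Map each syndrome to a vector of minimum weight in the corresponding coset."""
    leaders = {}
    # Sort by weight first; ties are broken in favour of vectors with earlier 1's.
    vectors = sorted(product([0, 1], repeat=n),
                     key=lambda v: (sum(v), tuple(-a for a in v)))
    for v in vectors:
        leaders.setdefault(syndrome(v, H), v)
    return leaders


def decode(y, H, leaders):
    """Subtract the coset leader belonging to the syndrome of y."""
    f = leaders[syndrome(y, H)]
    return tuple((a - b) % 2 for a, b in zip(y, f))


leaders = coset_leaders(H, 4)
print(leaders)
print(decode((1, 1, 0, 1), H, leaders))
```

Only the table of coset leaders is stored, not the whole standard array.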

It is not necessary for decoding to write down the complete standard array, but 
merely the first column, consisting of the coset leaders. We note that this method 
of decoding assumes that all errors are equally likely, i.e. that we have a symmetric 
channel. In more general cases one has to modify the weight function by taking the 
probability of error into account; we shall not enter into the details. 

We now turn to the construction of linear codes with a large value of $M$ for given
$n$ and $d$. By the Gilbert-Varshamov bound in Theorem 10.2.2 we have $A_q(n, d) \ge q^k$,
provided that $q^{n-k} \ge V_q(n, d - 1)$. However, this result does not guarantee the
construction of linear codes. In fact we can construct linear codes with a rather
better bound, as the next result shows.


Theorem 10.3.2. There exists an $[n, k]$-code over $F_q$ with minimum distance at least $d$,
provided that

$$V_q(n - 1, d - 2) < q^{n-k}. \tag{10.3.8}$$

For comparison we note that Theorem 10.2.2 gives $V_q(n, d - 1) \le q^{n-k}$, so (10.3.8)
is a weaker condition.


Proof. Let $C$ be any $[n, k]$-code over $F_q$ with parity check matrix $H$. Each vector $x$
in $C$ satisfies $xH^T = 0$; this equation means that the entries of $x$ define a linear
dependence between the rows of $H^T$, i.e. the columns of $H$. We require a code for
which the minimum distance is at least $d$, i.e. no vector of $C$ has weight less than
$d$; this will follow if no $d - 1$ columns of $H$ are linearly dependent.

To construct such a matrix $H$ we need only choose successively $n$ vectors in ${}^{n-k}F_q$
such that none is a linear combination of $d - 2$ of the preceding ones. In choosing
the $r$-th column we have to avoid the vectors that are linear combinations of at most
$d - 2$ of the preceding $r - 1$ columns. We count the vectors to be avoided by picking
$\delta \le d - 2$ columns in $\binom{r-1}{\delta}$ ways and choosing the coefficients in $(q - 1)^{\delta}$ ways.
Hence the number of vectors to be avoided is

$$1 + \binom{r-1}{1}(q - 1) + \binom{r-1}{2}(q - 1)^2 + \dots + \binom{r-1}{d-2}(q - 1)^{d-2} = V_q(r - 1, d - 2).$$

Thus we can adjoin an $r$-th column, provided that $V(r - 1, d - 2) < q^{n-k}$. By
(10.3.8) this holds for $r = 1, \dots, n$, so we can form the required parity check
matrix $H$. This proves the existence of a code with the required properties, since it
is completely determined by $H$. ∎
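The column-by-column construction in this proof can be imitated by a greedy search. The sketch below is restricted to the binary case $q = 2$ (so that all non-zero coefficients equal 1) and uses names of our own; the parameter `rows` stands for $n - k$. It is a brute-force illustration, practical only for small parameters.

```python
from itertools import combinations, product


def is_combination_of_at_most(v, columns, t):
    """True if v is a sum of at most t of the given binary columns (empty sum = 0)."""
    for s in range(t + 1):
        for subset in combinations(columns, s):
            total = tuple(sum(c[i] for c in subset) % 2 for i in range(len(v)))
            if total == v:
                return True
    return False


def greedy_parity_check_columns(n, rows, d):
    """Choose n columns in F_2^rows such that no d - 1 of them are linearly dependent."""
    columns = []
    for candidate in product([0, 1], repeat=rows):
        if len(columns) == n:
            break
        # A vector rejected once stays rejected, so a single pass suffices.
        if not is_combination_of_at_most(candidate, columns, d - 2):
            columns.append(candidate)
    if len(columns) < n:
        raise ValueError("no parity check matrix found for these parameters")
    return columns


cols = greedy_parity_check_columns(n=7, rows=3, d=3)
print(cols)
```

For $n = 7$, $n - k = 3$, $d = 3$ the search returns the seven non-zero vectors of $F_2^3$, which are the columns of a parity check matrix of the binary $[7, 4]$ Hamming code.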


Let us examine the case $d = 3$. If $n$, $q$ are such that

$$V_q(n, 1) = q^{n-k}, \tag{10.3.9}$$

then (10.3.8) holds for $d = 3$, because $V_q$ is an increasing function of its arguments;
so when (10.3.9) holds, we can construct an $[n, k]$-code $C$ with minimum distance 3.
This means that the 1-spheres about the code words are disjoint, and since by
(10.3.9), $q^k \cdot V_q(n, 1) = q^n$, it follows that our code is perfect. These codes are
known as Hamming codes. The equation (10.3.9) in this case reduces to
$1 + n(q - 1) = q^{n-k}$, i.e.

$$n = \frac{q^{n-k} - 1}{q - 1}. \tag{10.3.10}$$

Thus a Hamming code has the property that any two columns of its parity check
matrix are linearly independent. In any code with odd minimum distance
$d = 2e + 1$ every error pattern of weight at most $e$ is the unique coset leader in its
coset, because two vectors of weight $\le e$ have distance $\le 2e$ and so are in different
cosets. For a perfect code all coset leaders are of this form. Thus in a Hamming
code each vector of weight 1 is the unique vector of least weight in its coset. Now
the number of cosets is $q^n/q^k = q^{n-k}$. Omitting the zero coset we see from
(10.3.10) that we have just $n(q - 1)$ non-zero cosets and these are represented by
taking as coset leaders the $n(q - 1)$ vectors of weight 1. This makes a Hamming
code particularly easy to decode: given $x \in F_q^n$, we calculate $xH^T$. If $x$ has a single
error (which is all we can detect), then $xH^T = \gamma H_j^T$, where $H_j$ is the $j$-th column
of $H$ and $\gamma \in F_q$. Now the error can be corrected by subtracting $\gamma$ from the $j$-th
coordinate of $x$.
The simplest non-trivial case is the binary [3, 1]-code with generator and parity
check matrices

$$G = (1 \ 1 \ 1), \qquad H = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}.$$

It consists in repeating each code word three times. The information rate is
$1/3 = 0.33$.

The next case of the Hamming code, the binary [7, 4]-code, is one of the best-known
codes and one of the first to be discovered (in 1947). Its generator and
parity check matrices are

$$G = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 1 & 1 \\
0 & 1 & 0 & 0 & 1 & 0 & 1 \\
0 & 0 & 1 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 & 1
\end{pmatrix}, \qquad
H = \begin{pmatrix}
0 & 1 & 1 & 1 & 1 & 0 & 0 \\
1 & 0 & 1 & 1 & 0 & 1 & 0 \\
1 & 1 & 0 & 1 & 0 & 0 & 1
\end{pmatrix}.$$

Here the information rate is $4/7 = 0.57$. The minimum distance is 3, so the code will
correct 1 and detect 2 errors.
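The decoding rule for Hamming codes is easily tried out on the binary [7, 4]-code with the matrices $G$ and $H$ given above. In the binary case $\gamma = 1$, so correcting means flipping the coordinate whose column of $H$ equals the syndrome. The function names in this sketch are our own.

```python
from itertools import product

G = [[1, 0, 0, 0, 0, 1, 1],
     [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 1, 1, 0],
     [0, 0, 0, 1, 1, 1, 1]]
H = [[0, 1, 1, 1, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [1, 1, 0, 1, 0, 0, 1]]


def encode(u):
    """Code word uG over F_2; the first four symbols are the message."""
    return [sum(u[i] * G[i][j] for i in range(4)) % 2 for j in range(7)]


def syndrome(x):
    """The vector x H^T over F_2."""
    return [sum(a * b for a, b in zip(x, row)) % 2 for row in H]


def correct_single_error(x):
    """Flip the coordinate j whose column H_j equals the syndrome of x."""
    s = syndrome(x)
    if not any(s):
        return list(x)
    for j in range(7):
        if [row[j] for row in H] == s:
            y = list(x)
            y[j] ^= 1
            return y
    raise ValueError("syndrome matches no column of H")


# Every single error in every code word is corrected.
all_corrected = True
for u in product([0, 1], repeat=4):
    c = encode(u)
    for j in range(7):
        received = list(c)
        received[j] ^= 1
        all_corrected = all_corrected and correct_single_error(received) == c
print(all_corrected)
```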
From any q-ary {n, k|-code C we can form another code 


C= | (#1. 820-2386 Dom) 


called the extension of C by parity check. If C is binary with odd minimum distance 
d, then C has minimum distance d +1 and its parity check matrix H is obtained 
from the of C by bordering it first with a zero column and then a row of 1’s. 
From an (n, M, d)-code we thus obtain an (n + 1,M,d-+ 1)-code, and we can get 
C back by puncturing C (in the last column). 

Theorem 10.3.2 implicitly gives a lower bound for d in terms of q, n, k, but it does 
not seem easy to make this explicit. However, we do have the following upper bound 
for d: 


Proposition 10.3.3 (Plotkin bound). Let C be a linear [n, k]-code over $F_q$. Then the
minimum distance d of C satisfies

$$d \le \frac{n(q-1)q^{k-1}}{q^k - 1}. \qquad (10.3.11)$$


Proof. C contains $q^k - 1$ non-zero vectors; their minimum weight is d, hence the
sum of their weights is at least $d(q^k - 1)$. Consider the contribution made by the
different components to this sum. If all the vectors in C have zero first component,
this contribution is 0; otherwise we write $C_1$ for the subspace of vectors in C whose
first component is zero. Then $C/C_1 \cong F_q$ and so $|C_1| = q^{k-1}$. Thus there are
$q^k - q^{k-1}$ vectors with non-zero first component. In all there are n components,
and their total contribution to the sum of the weights is at most $n(q^k - q^{k-1})$.
Hence $d(q^k - 1) \le n(q^k - q^{k-1})$ and (10.3.11) follows. ∎
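
As a numerical illustration of (10.3.11), the sketch below (Python; the helper names are hypothetical and the field is assumed to be a prime field, so that arithmetic mod q suffices) computes the minimum distance of a small code by brute force and compares it with the bound. The example is the binary [7, 3]-code whose generator matrix has all non-zero vectors of length 3 as columns.

```python
from itertools import product

def min_distance(gen_rows, q=2):
    """Brute-force minimum weight of the code spanned by gen_rows over F_q (q prime)."""
    n = len(gen_rows[0])
    best = n
    for coeffs in product(range(q), repeat=len(gen_rows)):
        if any(coeffs):
            word = [sum(c * row[j] for c, row in zip(coeffs, gen_rows)) % q
                    for j in range(n)]
            best = min(best, sum(1 for x in word if x))
    return best

def plotkin(n, k, q=2):
    """Right-hand side of (10.3.11) as an exact (numerator, denominator) pair."""
    return n * (q - 1) * q ** (k - 1), q ** k - 1

# Columns of this matrix are the numbers 1, ..., 7 written in binary.
simplex = [[(j >> i) & 1 for j in range(1, 8)] for i in range(3)]
d = min_distance(simplex)
num, den = plotkin(7, 3)
print(d, num / den)
```

For this code the bound is attained, which is consistent with the proof: every non-zero code word has the same weight.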


Sometimes a more precise measure than the minimum distance is needed. This
is provided by the weight enumerator of a code C, defined in terms of the weights of
the code words as

$$A(z) = \sum_{c \in C} z^{w(c)} = \sum_{i=0}^{n} A_i z^i,$$

where $A_i$ is the number of code words of weight i in C. A basic result, the
MacWilliams identity, relates the weight enumerator of a code to that of its dual. This
is useful for finding the weight enumerator of an [n, k]-code when k is close to n,
so that n − k is small. We begin with a lemma on characters in fields. Here we
understand by a character on a field F a homomorphism from the additive group of F to the
multiplicative group of complex numbers, non-trivial if it takes values other than 1.
By the duality of abelian groups (BA, Section 4.9) every finite field has non-trivial
characters χ, and $\sum_{\alpha \in F} \chi(\alpha) = 0$ by orthogonality to the trivial character.


Lemma 10.3.4. Let χ be a non-trivial character of $(F_q, +)$ and define

$$f(u) = \sum_{v \in F_q^n} \chi(uv^T) z^{w(v)}, \quad \text{where } u \in F_q^n. \qquad (10.3.12)$$

Then for any code C,

$$\sum_{u \in C} f(u) = |C|\,B(z), \qquad (10.3.13)$$

where B(z) is the weight enumerator of the dual code $C^\perp$ and |C| is the number of code
words in C.


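Proof. We have

$$\sum_{u \in C} f(u) = \sum_{u \in C} \sum_{v \in F_q^n} \chi(uv^T) z^{w(v)} = \sum_{v \in F_q^n} z^{w(v)} \sum_{u \in C} \chi(uv^T).$$

For $v \in C^\perp$ the second sum on the right is |C|. If $v \notin C^\perp$, then $uv^T$ takes every value
in $F_q$ the same number of times, say N times, and we have $\sum_{u \in C} \chi(uv^T) =
N \sum_{\alpha \in F_q} \chi(\alpha) = 0$, because χ is non-trivial. Hence the right-hand side reduces to
$|C|\,B(z)$. ∎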


We can now derive a formula for the weight enumerator of the dual code. 


Theorem 10.3.5 (MacWilliams identity). Let C be an [n, k]-code over $F_q$ with weight
enumerator A(z) and let B(z) be the weight enumerator of the dual code $C^\perp$. Then

$$B(z) = q^{-k}\,[1 + (q-1)z]^n\, A\!\left(\frac{1-z}{1+(q-1)z}\right). \qquad (10.3.14)$$


Proof. Let us extend the weight to $F_q$ by treating it as a one-dimensional vector
space; thus w(a) = 1 for $a \in F_q^\times$ and w(0) = 0. Next, defining f(u) as in (10.3.12),
we have

$$f(u) = \sum_{v \in F_q^n} z^{w(v_1) + \cdots + w(v_n)} \chi(u_1v_1 + \cdots + u_nv_n) = \prod_{i=1}^{n} \sum_{v \in F_q} z^{w(v)} \chi(u_iv).$$

If $u_i = 0$, the sum in this expression is 1 + (q − 1)z, while for $u_i \ne 0$ it is

$$1 + z \sum_{\alpha \ne 0} \chi(\alpha) = 1 - z.$$

Hence we obtain

$$f(u) = (1 - z)^{w(u)}\,[1 + (q-1)z]^{n - w(u)}.$$

Substituting into (10.3.13) and remembering that $|C| = q^k$, we obtain (10.3.14). ∎


Sometimes it is more convenient to use A(z) in its homogeneous form, defined by
$A(x, y) = A(yx^{-1})x^n = \sum A_i x^{n-i} y^i$. Then (10.3.14) takes the form

$$B(x, y) = q^{-k} A(x + (q-1)y,\ x - y). \qquad (10.3.15)$$


To illustrate Theorem 10.3.5, consider the binary Hamming code C of length
$n = 2^k - 1$ and dimension n − k over $F_2$. Its dual code has as generator matrix
the parity check matrix H of C, whose columns are all the non-zero vectors in $F_2^k$.
Hence any non-zero linear combination of the rows of $H = (h_{ij})$ has the i-th
coordinate

$$a_1h_{1i} + a_2h_{2i} + \cdots + a_kh_{ki}.$$

This vanishes for $2^{k-1} - 1$ columns (forming with 0 a (k − 1)-dimensional
subspace), and so is non-zero for the remaining $2^{k-1}$ columns. Hence every non-zero
vector in the dual code $C^\perp$ has exactly $2^{k-1}$ non-zero components, and so
$B(z) = 1 + nz^{(n+1)/2}$. By Theorem 10.3.5, the weight enumerator of the Hamming code is

$$A(z) = 2^{-k}\big\{[1 + (q-1)z]^n + n\,[1 + (q-1)z]^{(n-1)/2}(1 - z)^{(n+1)/2}\big\}, \quad \text{where } q = 2.$$
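
The case k = 3 (length 7) is small enough to check by machine. The sketch below, in Python with hypothetical helper names, enumerates the dual code directly and then applies (10.3.14) with q = 2, with the roles of the code and its dual interchanged, to recover the weight distribution of the Hamming code.

```python
from itertools import product
from math import comb

def weight_distribution(gen_rows):
    """Return [A_0, ..., A_n] for the binary code spanned by gen_rows."""
    n = len(gen_rows[0])
    dist = [0] * (n + 1)
    for coeffs in product((0, 1), repeat=len(gen_rows)):
        word = [sum(c * row[j] for c, row in zip(coeffs, gen_rows)) % 2
                for j in range(n)]
        dist[sum(word)] += 1
    return dist

def macwilliams_dual(A, k):
    """Coefficients of 2^(-k) * sum_i A_i (1 - z)^i (1 + z)^(n - i)."""
    n = len(A) - 1
    B = [0] * (n + 1)
    for i, a in enumerate(A):
        for s in range(i + 1):            # term (-z)^s from (1 - z)^i
            for t in range(n - i + 1):    # term z^t from (1 + z)^(n - i)
                B[s + t] += a * (-1) ** s * comb(i, s) * comb(n - i, t)
    return [b // 2 ** k for b in B]

# Rows of H: the columns are the numbers 1, ..., 7 in binary notation.
H = [[(j >> i) & 1 for j in range(1, 8)] for i in range(3)]
B = weight_distribution(H)       # distribution of the dual code
A = macwilliams_dual(B, k=3)     # distribution of the [7, 4]-Hamming code
print(B, A)
```

The output agrees with $B(z) = 1 + 7z^4$ and gives $A(z) = 1 + 7z^3 + 7z^4 + z^7$.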


The weight enumerator is used in computing probabilities of transmission error.
In any code C, an error, changing a code word x to y = x + e, will be detected
provided that y is not a code word. Thus the error will go undetected if y ∈ C or,
equivalently, if e ∈ C. If the channel is binary symmetric, with probability p of symbol error,
then the probability of a given error vector e of weight i is $p^i(1-p)^{n-i}$. Thus the
probability $P_{un}(C)$ that an incorrect code word will be received is independent of the
code word sent and is

$$P_{un}(C) = \sum_{i=1}^{n} A_i p^i (1-p)^{n-i} = \left[A\!\left(\frac{p}{1-p}\right) - 1\right](1-p)^n.$$
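
To get a feeling for the size of this probability, here is a short sketch in Python (function names are illustrative) that evaluates both forms of the expression for the [7, 4]-Hamming code, whose weight enumerator is $1 + 7z^3 + 7z^4 + z^7$, with an assumed symbol error probability p = 0.01.

```python
def p_undetected(A, p):
    """Sum over i >= 1 of A_i p^i (1 - p)^(n - i), for A = [A_0, ..., A_n]."""
    n = len(A) - 1
    return sum(a * p ** i * (1 - p) ** (n - i) for i, a in enumerate(A) if i > 0)

def p_undetected_closed(A, p):
    """The same quantity computed as [A(p / (1 - p)) - 1] (1 - p)^n."""
    n = len(A) - 1
    z = p / (1 - p)
    return (sum(a * z ** i for i, a in enumerate(A)) - 1) * (1 - p) ** n

hamming_74 = [1, 0, 0, 7, 7, 0, 0, 1]   # 1 + 7z^3 + 7z^4 + z^7
print(p_undetected(hamming_74, 0.01))
print(p_undetected_closed(hamming_74, 0.01))
```

The dominant contribution comes from the seven code words of weight 3, so the result is of the order of $7p^3$.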




Exercises 


1. Show that a code C can correct t errors and detect a further s errors if
$d(C) \ge 2t + s + 1$.

2. Let C be a block code of length n over an alphabet Q and let λ ∈ Q. Show that C is
equivalent to a code which includes the word $\lambda^n$.

3. Show that for odd d, $A_2(n, d) = A_2(n+1, d+1)$. (Hint. Use extension by parity
check and puncturing.)

4. Construct a table of syndromes and coset leaders for the ternary [4, 2]-Hamming
code.

5. Show that the binary [7, 4]-Hamming code extended by parity check is self-dual.

6. Show that for linear codes $A_q(n, d) \ge q^k$, where k is the largest integer satisfying
$q^k\,V_q(n-1, d-2) < q^n$.

7. Show that for an [n, k]-code the MacWilliams identity can be written


i 1 fi n—-1 
A; = q*~". — 1)! B;,, forO<r<n4. 
»() vd. Kaa _ 


i=0 7= 0 


8. Verify that formula (10.3.14) is consistent with the corresponding formula for the 
dual code. (Hint. Use the form (10.3.15).) 


10.4 Cyclic codes 


A code C is said to be cyclic if the set of code words is unchanged by permuting the
coordinates cyclically: if $c = (c_0, c_1, \ldots, c_{n-1}) \in C$, then $(c_{n-1}, c_0, c_1, \ldots, c_{n-2}) \in C$.
We shall here assume all our cyclic codes to be linear. For cyclic codes it is
convenient to number the coordinates from 0 to n − 1. We shall identify any code
word c with the corresponding polynomial

$$c(x) = c_0 + c_1x + \cdots + c_{n-1}x^{n-1},$$

in the ring $A_n = F_q[x]/(x^n - 1)$. In this sense we can interpret any linear code as a
subset of $A_n$ and the cyclic permutation corresponds to multiplication by x. Clearly
a subspace of $A_n$ admits multiplication by x iff it is an ideal, and this proves

Theorem 10.4.1. A linear code in $F_q^n$ is cyclic if and only if it is an ideal in
$A_n = F_q[x]/(x^n - 1)$.

We note that the ring $A_n$ has $q^n$ elements; as a homomorphic image of $F_q[x]$ it is a
principal ideal ring, but it is not an integral domain, since $x^n - 1$ is reducible for
n > 1.


Henceforth we shall assume that (n, q) = 1. Then $x^n - 1$ splits into distinct
irreducible factors over $F_q$ and hence $A_n$ is a direct product of extension fields of
$F_q$ (see BA, Corollary 11.7.4). By Theorem 10.4.1 every cyclic code over $F_q$ can be
generated by a polynomial g. Here g can be taken to be a factor of $x^n - 1$, since
we can replace it by $ug - v(x^n - 1)$ without affecting the code. As a monic factor
of $x^n - 1$ the generator of a cyclic code is uniquely determined. We record an
expression for the generator matrix in terms of the generator polynomial:

Theorem 10.4.2. Let C be a cyclic code of length n with generator polynomial

$$g = g_0 + g_1x + \cdots + g_rx^r \quad (g_r = 1).$$

Then dim(C) = n − r and a generator matrix for C is given by

$$G = \begin{pmatrix}
g_0 & g_1 & \cdots & g_r & 0 & 0 & \cdots & 0 \\
0 & g_0 & \cdots & g_{r-1} & g_r & 0 & \cdots & 0 \\
\vdots & & \ddots & & & \ddots & & \vdots \\
0 & 0 & \cdots & g_0 & g_1 & g_2 & \cdots & g_r
\end{pmatrix}.$$

Proof. We have $g_r = 1$ and by considering the last n − r columns we see that G is left full.
The n − r rows represent the code words $g, xg, \ldots, x^{n-r-1}g$ and we have to show
that their linear combinations are just the code words. This is clear since the code
words are of the form fg, where f is a polynomial of degree < n − r. ∎
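
The shape of this matrix is easy to generate mechanically. A minimal sketch in Python follows; the function name `generator_matrix` and the sample polynomial $g = 1 + x + x^3$ over $F_2$ with n = 7 are chosen here purely for illustration.

```python
def generator_matrix(g, n):
    """Rows g, xg, ..., x^(n-r-1) g as coefficient lists of length n.

    g is the list [g_0, ..., g_r] of coefficients of the generator polynomial.
    """
    r = len(g) - 1
    return [[0] * i + list(g) + [0] * (n - r - 1 - i) for i in range(n - r)]

G = generator_matrix([1, 1, 0, 1], 7)   # g = 1 + x + x^3
for row in G:
    print(row)
```

Each row is the previous one shifted one place to the right, so the last n − r columns form a triangular block with $g_r$ on the diagonal.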


We next derive a parity check matrix for the cyclic code C. This is done most easily
in terms of an appropriate polynomial. Let C be a cyclic [n, k]-code with generator
polynomial g. Then g is a divisor of $x^n - 1$, so there is a unique polynomial h
satisfying

$$g(x)h(x) = x^n - 1. \qquad (10.4.1)$$

h is called the check polynomial of C. Clearly it is monic and its degree is
n − deg g = n − (n − k) = k. A cyclic code is said to be maximal if its generator
polynomial is irreducible; if its dual is a maximal cyclic code, it is called minimal
or irreducible. It is now an easy matter to describe a parity check matrix; to simplify
the notation we shall use ≡ to indicate congruence mod $(x^n - 1)$, i.e. equality in the
ring $A_n$.

Theorem 10.4.3. Let C be a cyclic [n, k]-code with check polynomial

$$h = h_0 + h_1x + \cdots + h_kx^k.$$

Then

(i) c ∈ C if and only if ch ≡ 0,

(ii) a parity check matrix for C is

$$H = \begin{pmatrix}
h_k & h_{k-1} & \cdots & h_0 & 0 & \cdots & 0 \\
0 & h_k & \cdots & h_1 & h_0 & \cdots & 0 \\
\vdots & & \ddots & & & \ddots & \vdots \\
0 & 0 & \cdots & h_k & h_{k-1} & \cdots & h_0
\end{pmatrix},$$

(iii) the dual code $C^\perp$ is cyclic, generated by the reciprocal of the check polynomial for C:

$$\bar{h} = h_k + h_{k-1}x + \cdots + h_0x^k.$$

Proof. (i) By definition, c ∈ C iff c ≡ ag. By (10.4.1) gh ≡ 0, hence if c ∈ C, then
ch ≡ agh ≡ 0. Conversely, if ch ≡ 0, then ch is divisible by $x^n - 1 = gh$, hence c is
divisible by g.

(ii) On multiplying the i-th row of G by the j-th row of H, we obtain

$$g_0h_{k-i+j} + g_1h_{k-i+j-1} + \cdots + g_{k-i+j}h_0, \qquad (10.4.2)$$

and this vanishes, as the coefficient of $x^{k-i+j}$ in gh. Thus $GH^T = 0$, and since H is a
left full r × n matrix, it is indeed a parity check matrix for C.

(iii) By comparing the form of H with that for G, we see that $\bar{h}$ is a generator
polynomial for $C^\perp$. ∎


We go on to describe how generator and check polynomials are used for coding
and decoding a cyclic code. Let C be a cyclic [n, k]-code with generator polynomial
g of degree r = n − k and check polynomial h of degree k. Given a message
$a = a_0a_1 \ldots a_{k-1} \in F_q^k$, we regard this as a polynomial $a = \sum a_ix^i$ of degree < k
over $F_q$. We encode a by multiplying it by g and obtain a polynomial u = ag of
degree < n. We note that any code word ≠ 0 has degree at least r = deg g.

For any polynomial f of degree < n we calculate its syndrome S(f) by multiplying
the coefficients of f by the rows of the parity check matrix H. The result is

$$\big((fh)_k, (fh)_{k+1}, \ldots, (fh)_{n-1}\big), \qquad (10.4.3)$$

where for any polynomial φ in x, $\varphi_i$ denotes the coefficient of $x^i$. To represent
(10.4.3) as a polynomial, we take the polynomial part of $x^{-k}(fh)$, ignoring powers
beyond $x^{n-1}$. This can also be achieved by reducing fh (mod $x^n - 1$) to a polynomial
of degree < n and then taking the quotient of the division by $x^k$:

$$fh \equiv x^kS(f) + \rho, \quad \text{where } \deg \rho < k. \qquad (10.4.4)$$

Since deg f < n, the highest possible power in fh is $x^{n+k-1}$. When reduced this
becomes $x^{k-1}$, and so does not affect the quotient in (10.4.4). Therefore S(f) is
indeed the syndrome of f, and as before S(f) = 0 precisely when f has the form
ag. By reducing fh (mod $x^n - 1$), we obtain a representative of degree < n; hence
S(f) is of degree < n − k = r, as one would expect.

Now we choose for each possible syndrome u a coset leader L(u) of least weight.
To decode a word f we compute its syndrome S(f) and subtract the corresponding
coset leader: f − LS(f) is a code word, so we have $(f - LS(f))h = a(x^n - 1)$, for
some a, and this a is the required decoding of f. For example,
$x^7 - 1 = (x - 1)(x^3 + x^2 + 1)(x^3 + x + 1)$ is a complete factorization over $F_2$. Let us take
$g = x^3 + x + 1$, $h = x^4 + x^2 + x + 1$, so r = 3, k = 4. Suppose we encode $x^3 + x$,
obtaining the code word $(x^3 + x)(x^3 + x + 1) = x^6 + x^3 + x^2 + x$. Owing to errors
in transmission this is received as $x^6 + x^3 + x$. We have
$x^6 + x^3 + x = (x^3 + x)(x^3 + x + 1) + x^2$, so the coset leader is $x^2$, and adding this to the received
word we get $x^6 + x^3 + x^2 + x$. Now $(x^6 + x^3 + x^2 + x)(x^4 + x^2 + x + 1) =
(x^3 + x)(x^7 - 1)$, so our message is (correctly) decoded as $x^3 + x$.
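
The arithmetic of this example can be replayed by machine. The sketch below (Python) stores a polynomial over $F_2$ as an integer bit mask, bit i being the coefficient of $x^i$. Note that it is a simplification and not the syndrome method of (10.4.4): it tries each coset leader of weight at most 1 and tests divisibility by g, which gives the same result for this single-error-correcting code. All function names are illustrative.

```python
def pmul(a, b):
    """Product of two polynomials over F_2 stored as bit masks."""
    out = 0
    while b:
        if b & 1:
            out ^= a
        a <<= 1
        b >>= 1
    return out

def pdivmod(a, b):
    """Quotient and remainder of a by b in F_2[x]."""
    q = 0
    while a and a.bit_length() >= b.bit_length():
        shift = a.bit_length() - b.bit_length()
        q ^= 1 << shift
        a ^= b << shift
    return q, a

N = 7
G_POLY = 0b1011      # g = x^3 + x + 1
H_POLY = 0b10111     # h = x^4 + x^2 + x + 1

def decode(f):
    """Correct at most one error in f, then divide by g to recover the message."""
    for e in [0] + [1 << j for j in range(N)]:   # coset leaders of weight <= 1
        msg, rem = pdivmod(f ^ e, G_POLY)
        if rem == 0:
            return msg
    raise ValueError("no code word within distance 1")

message = 0b1010                      # x^3 + x
word = pmul(message, G_POLY)          # x^6 + x^3 + x^2 + x
received = word ^ 0b100               # error in the x^2 term
print(bin(word), bin(received), bin(decode(received)))
```

The remainder of the received word on division by g is $x^2$, in agreement with the coset leader found in the text.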

Sometimes it is useful to choose the generator for a cyclic code in a different way.
In BA, Corollary 11.7.4 we saw that $A_n = F_q[x]/(x^n - 1)$ is a direct product of fields,
say

$$A_n = K_1 \times \cdots \times K_r,$$

where $K_i$ corresponds to the irreducible factor $f_i$ of $x^n - 1$. The generator polynomial
of a cyclic code C is the product of certain of the $f_i$, say (in suitable numbering)
$f_{s+1}, \ldots, f_r$. If $e_i$ denotes the unit element of $K_i$, then $e = e_1 + \cdots + e_s$ is an element
of $A_n$ which is idempotent and which can also be used to generate the code. For the
polynomials corresponding to the code words are the elements of $K_1 \times \cdots \times K_s$ and
these are the elements $c \in A_n$ such that c = ce. Thus every cyclic code has an
idempotent generator; of course this will in general no longer be a factor of $x^n - 1$. To
find the idempotent generator, suppose that C is a cyclic code with generator
polynomial g and check polynomial h, so that $gh = x^n - 1$. Since g, h are coprime, there
exist polynomials u, v such that ug + vh = 1. It follows that ug is the idempotent
generator, for we have $(ug)^2 = ug(1 - vh) \equiv ug$. Thus in the above example the
idempotent generator is $xg = x^4 + x^2 + x$.

For examples of cyclic codes consider again the binary [n, n − k]-Hamming code.
Its parity check matrix is k × n and its columns are all the non-zero vectors of $F_2^k$,
for they are distinct; hence any two are linearly independent, and there are
$n = 2^k - 1$ of them. Let us write $q = 2^k$ and consider $F_q$ as a vector space over $F_2$;
this is a k-dimensional space, so the columns of the above parity check matrix are
represented by the non-zero elements of $F_q$. If these elements are $\alpha_1, \ldots, \alpha_{q-1}$, we
can regard

$$(\alpha_1\ \ \alpha_2\ \ \ldots\ \ \alpha_{q-1}) \qquad (10.4.5)$$

as a parity check matrix for the Hamming code. This will correct one error, and it
seems plausible that we can correct more errors by including further rows,
independent of (10.4.5).

To obtain such rows we recall that for any n distinct elements $c_1, \ldots, c_n$ of a
field, the Vandermonde matrix

$$V(c_1, c_2, \ldots, c_n) = \begin{pmatrix}
1 & 1 & \cdots & 1 \\
c_1 & c_2 & \cdots & c_n \\
c_1^2 & c_2^2 & \cdots & c_n^2 \\
\vdots & \vdots & & \vdots \\
c_1^{n-1} & c_2^{n-1} & \cdots & c_n^{n-1}
\end{pmatrix}$$

is non-singular; as is well known (and easily checked) its determinant is

$$\prod_{i > j} (c_i - c_j).$$


Theorem 10.4.4. Let $q = 2^m$ and denote by $\alpha_1, \ldots, \alpha_{q-1}$ the non-zero elements of $F_q$.
Then for any integer t < q/2 there is a [q − 1, k]-code over $F_2$ with k ≥ q − 1 − mt and
minimum distance at least 2t + 1 and with parity check matrix

$$H = \begin{pmatrix}
\alpha_1 & \alpha_2 & \cdots & \alpha_{q-1} \\
\alpha_1^3 & \alpha_2^3 & \cdots & \alpha_{q-1}^3 \\
\vdots & \vdots & & \vdots \\
\alpha_1^{2t-1} & \alpha_2^{2t-1} & \cdots & \alpha_{q-1}^{2t-1}
\end{pmatrix}. \qquad (10.4.6)$$

Proof. A vector $c \in F_2^{q-1}$ is a code word iff $cH^T = 0$, i.e.

$$\sum_i c_i\alpha_i^j = 0 \quad \text{for } j = 1, 3, \ldots, 2t - 1. \qquad (10.4.7)$$

On squaring the j-th equation we find $(\sum c_i\alpha_i^j)^2 = \sum c_i\alpha_i^{2j}$, because $\alpha_i \in F_q$ and
$c_i \in F_2$. Hence (10.4.7) holds for all j = 1, 2, ..., 2t; if we insert the corresponding
rows in the matrix (10.4.6), we see that the square matrix formed by the first 2t
columns has determinant $\alpha_1 \cdots \alpha_{2t} \det V(\alpha_1, \ldots, \alpha_{2t})$, and this is non-zero since
$0, \alpha_1, \ldots, \alpha_{2t}$ are distinct. Thus the first 2t columns of the new matrix are linearly
independent, and similarly for any other set of 2t columns. This means that none
of the non-zero vectors c in (10.4.7) can have weight ≤ 2t, so the minimum distance of our
code is at least 2t + 1. ∎


The binary code with parity check matrix (10.4.6) is called a BCH-code, after its 
discoverers R. C. Bose, D. K. Ray-Chaudhuri and A. Hocquenghem. 
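
For a concrete instance, take m = 4 and t = 2, so that the code has length 15 and the parity check matrix has the two rows $(\alpha_i)$ and $(\alpha_i^3)$. The Python sketch below builds $F_{16}$ from the primitive polynomial $x^4 + x + 1$ (an assumed choice, as is the ordering $\alpha_i = \alpha^i$ of the non-zero elements) and finds the code by exhaustive search.

```python
# Sketch: the binary BCH-code of Theorem 10.4.4 for m = 4, t = 2.
# EXP[i] holds alpha^i as a 4-bit mask, alpha a root of x^4 + x + 1 (assumed).
EXP = [1]
for _ in range(14):
    v = EXP[-1] << 1
    if v & 0b10000:
        v ^= 0b10011
    EXP.append(v)

def syndromes(c):
    """Return (sum c_i alpha^i, sum c_i alpha^(3i)) for a 15-bit word c."""
    s1 = s3 = 0
    for i in range(15):
        if (c >> i) & 1:
            s1 ^= EXP[i]
            s3 ^= EXP[(3 * i) % 15]
    return s1, s3

code = [c for c in range(1 << 15) if syndromes(c) == (0, 0)]
min_weight = min(bin(c).count("1") for c in code if c)
print(len(code), min_weight)
```

The search runs over all $2^{15}$ words, which is feasible only because the length is so small; the bounds k ≥ 15 − 8 = 7 and d ≥ 5 of the theorem can be compared with the printed values.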


Exercises 


1. The zero code of length n is the subspace 0 of $F_q^n$. Find its generator polynomial (as
cyclic code) and describe its dual, the universal code.

2. The repetition code is the [n, 1]-code consisting of all code words (y, y, ..., y),
y ∈ $F_q$. Find its generator polynomial and describe its dual, the zero-sum code.

3. Describe the cyclic code with generator polynomial x + 1 and its dual.

4. Verify that the [7, 4]-Hamming code is cyclic and find its generator polynomial.

5. Show that the [8, 4]-code obtained by extending the [7, 4]-Hamming code by
parity check is self-dual and has weight enumerator $z^8 + 14z^4 + 1$.

6. Show that a binary cyclic code contains a vector of odd weight iff x − 1 does not
divide the generator polynomial. Deduce that such a code contains the repetition
code.

7. The weight of a polynomial f is defined as the number w(f) of its non-zero
coefficients. Show that for polynomials over $F_q$, w(fg) ≤ w(f)w(g).



10.5 Other codes 


There are many other codes adapted to various purposes, and it is neither possible 
nor appropriate to include all the details here, but it may be of interest to make a 
brief mention of some of them. 

(i) Goppa codes. In (10.4.7) we can take the elements of $F_q$ to be $\alpha_0^{-1}, \ldots, \alpha_{n-1}^{-1}$.
Then the defining equations for the BCH-code take the form $\sum_i c_i\alpha_i^{-j} = 0$
(j = 1, 2, ..., 2t), or equivalently, $\sum_{i,j} c_ix^j\alpha_i^{-j-1} = 0$, where j runs from 0 to 2t − 1.
This can also be written

$$\sum_{i=0}^{n-1} \frac{c_i}{x - \alpha_i} \equiv 0 \pmod{x^{2t}}, \qquad (10.5.1)$$

and it leads to the following generalization.

Definition. Let g be a polynomial of degree t over $F_{q^m}$ and let $\alpha_0, \ldots, \alpha_{n-1} \in F_{q^m}$ be
such that $g(\alpha_i) \ne 0$ (i = 0, ..., n − 1). The Goppa code with Goppa polynomial g is
defined as the set of all vectors $c = (c_0, c_1, \ldots, c_{n-1})$ satisfying

$$\sum_{i=0}^{n-1} \frac{c_i}{x - \alpha_i} \equiv 0 \pmod{g(x)}. \qquad (10.5.2)$$


We see that Goppa codes are linear but not necessarily cyclic. In order to find a parity
check matrix for the Goppa code we recall that in the special case of the BCH-code
this was obtained by writing

$$(x - \alpha_i)^{-1} = -\sum_j x^j\alpha_i^{-j-1} \pmod{x^{2t}}.$$

The coefficients of $x^j$ (taken mod $x^{2t}$) form the entries of the (j + 1)-th row in the
parity check matrix. We have

$$\frac{1}{x - \alpha} \equiv -\frac{1}{g(\alpha)} \cdot \frac{g(x) - g(\alpha)}{x - \alpha} \pmod{g(x)}, \qquad (10.5.3)$$

and here the right-hand side is a polynomial in x; it is therefore the unique
polynomial of degree < t congruent (mod g) to $(x - \alpha)^{-1}$. In order to express (10.5.2) as a
polynomial in x we can proceed as follows: if $g(x) = \sum g_ix^i$, then

$$\frac{g(x) - g(y)}{x - y} = \sum_{j+k \le t-1} g_{j+k+1}\,x^jy^k;$$

further write $g(\alpha_i)^{-1} = h_i$. Then (10.5.2) becomes $\sum_{i,j} c_ih_{ij}x^j = 0$, where
$h_{ij} = h_i \sum_k g_{j+k+1}\alpha_i^k$. Thus the matrix $(h_{ij})$, with rows listed for j = t − 1, ..., 1, 0, is

$$\begin{pmatrix}
h_0g_t & h_1g_t & \cdots & h_{n-1}g_t \\
h_0(g_{t-1} + g_t\alpha_0) & \cdots & \cdots & h_{n-1}(g_{t-1} + g_t\alpha_{n-1}) \\
\vdots & & & \vdots \\
h_0(g_1 + g_2\alpha_0 + \cdots + g_t\alpha_0^{t-1}) & \cdots & \cdots & h_{n-1}(g_1 + g_2\alpha_{n-1} + \cdots + g_t\alpha_{n-1}^{t-1})
\end{pmatrix}.$$

By elementary row transformations (remembering that $g_t \ne 0$) we find that

$$H = \begin{pmatrix}
h_0 & h_1 & \cdots & h_{n-1} \\
h_0\alpha_0 & h_1\alpha_1 & \cdots & h_{n-1}\alpha_{n-1} \\
\vdots & \vdots & & \vdots \\
h_0\alpha_0^{t-1} & h_1\alpha_1^{t-1} & \cdots & h_{n-1}\alpha_{n-1}^{t-1}
\end{pmatrix}.$$

We see again that any t columns are linearly independent, hence the Goppa code has
minimum distance > t and its dimension is ≥ n − mt. There are several methods of
decoding Goppa codes, based on the Euclidean algorithm (see McEliece (1977)) and
the work of Ramanujan (see Hill (1985)).

(ii) Let C be any block code of length n. We obtain another code, possibly the
same, by permuting the n places in any way. The permutations which do not
change C form a subgroup of $\mathrm{Sym}_n$, the group of C, which may be denoted by
G(C). For example, the group of a cyclic code contains all translations $i \mapsto i + r$
(mod n). If s is prime to n, we have the permutation

$$\mu_s : i \mapsto si \pmod{n}. \qquad (10.5.4)$$

We remark that $\mu_s$ is an automorphism of $A_n = F_q[x]/(x^n - 1)$. For if $a(x) =
\sum a_ix^i$, then $a\mu_s = \sum a_ix^{is} = a(x^s)$, and the operation $a(x) \mapsto a(x^s)$ is clearly an
endomorphism; since s is prime to n, $\mu_s$ has finite order dividing φ(n) and so is
an automorphism.

A QR-code is a cyclic code of length n, an odd prime, which admits the
permutation (10.5.4) of its places, where s is a quadratic residue mod n. We shall examine
a particular case of QR-codes, following essentially van Lint (1982). Let n again be an
odd prime and q a prime power such that q is a non-zero quadratic residue mod n.
We write Q for the set of all non-zero quadratic residues mod n, N for the set of all
quadratic non-residues and let α be a primitive n-th root of 1 in an extension of $F_q$.
Write $E = F_q(\alpha)$ and put

$$g_0(x) = \prod_{r \in Q} (x - \alpha^r), \qquad g_1(x) = \prod_{r \in N} (x - \alpha^r).$$

Then

$$x^n - 1 = (x - 1)g_0(x)g_1(x).$$

The Galois group of $E/F_q$ is generated by the map $x \mapsto x^q$. Since q ∈ Q, this
operation permutes the zeros of $g_0$ as well as those of $g_1$; therefore $g_0$ and $g_1$ have their
coefficients in $F_q$. We note that $\mu_s$ interchanges $g_0$ and $g_1$ if s ∈ N, hence the
codes with generators $g_0$ and $g_1$ are equivalent. We shall be particularly interested
in the QR-code generated by $g_0$; our aim will be to find restrictions on the minimum
distance d. We recall from number theory (see e.g. BA, Further Exercise 24 of
Chapter 7) that 2 is a quadratic residue mod n iff n ≡ ±1 (mod 8), and that −1 is
a quadratic residue mod n iff n ≡ 1 (mod 4). We shall also need a lemma on weights;
for a polynomial f (regarded as a code word) the weight w(f) is of course just the
number of non-zero coefficients.



Lemma 10.5.1. Let f be a polynomial over $F_q$ such that f(1) ≠ 0. Then
$(1 + x + \cdots + x^{n-1})f$ has weight at least n. If q = 2 and deg f < n, then
$(1 + x + \cdots + x^{n-1})f$ has weight exactly n.

Proof. By the division algorithm, f = (x − 1)u + c, where c = f(1) ≠ 0. It follows
that

$$(1 + x + \cdots + x^{n-1})f = (x^n - 1)u + (1 + x + \cdots + x^{n-1})c. \qquad (10.5.5)$$

Suppose that w(u) = r; then the right-hand side has at least r terms of degree ≥ n,
while the terms in u can cancel at most r terms in $(1 + x + \cdots + x^{n-1})c$. So the total
weight is ≥ r + (n − r) = n.

If q = 2 and deg f < n, we again have (10.5.5), where now deg u < n − 1. Each
non-zero term in u will cancel a term in $1 + x + \cdots + x^{n-1}$, and this is exactly
compensated by the corresponding term in $x^nu$. Hence there are exactly n terms
on the right of (10.5.5). ∎
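
The binary part of the lemma can be checked exhaustively for small n. In the Python sketch below (helper names are illustrative) polynomials over $F_2$ are bit masks, and the condition f(1) ≠ 0 is expressed by saying that f has an odd number of terms; the value n = 7 is an arbitrary choice.

```python
def pmul(a, b):
    """Product in F_2[x] of polynomials stored as bit masks."""
    out = 0
    while b:
        if b & 1:
            out ^= a
        a <<= 1
        b >>= 1
    return out

def weight(a):
    """Number of non-zero coefficients."""
    return bin(a).count("1")

n = 7
all_ones = (1 << n) - 1          # 1 + x + ... + x^(n-1)
# Over F_2, f(1) != 0 means that f has an odd number of terms.
weights = {weight(pmul(all_ones, f))
           for f in range(1, 1 << n) if weight(f) % 2 == 1}
print(weights)
```

Note that the product is formed in $F_2[x]$, without reduction mod $x^n - 1$, as in (10.5.5).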


Proposition 10.5.2. Let C be a QR-code with generator $g_0$ and let c = c(x) be a code
word in C such that c(1) ≠ 0. Then

$$w(c)^2 \ge n. \qquad (10.5.6)$$

Moreover, if n ≡ −1 (mod 4), then

$$w(c)^2 - w(c) + 1 \ge n. \qquad (10.5.7)$$

If further, q = 2 and n ≡ −1 (mod 8), then

$$w(c) \equiv -1 \pmod{4}. \qquad (10.5.8)$$

Proof. The polynomial c(x) is divisible by $g_0$ but not by x − 1, because c(1) ≠ 0.
For suitable s, $\mu_s$ will transform c(x) into a polynomial c*(x) divisible by $g_1$ and
again not by x − 1. This means that cc* is a multiple of $g_0g_1 =
1 + x + \cdots + x^{n-1}$, and so, by Lemma 10.5.1, w(cc*) ≥ n. Now (10.5.6) follows
because $w(cc^*) \le w(c)w(c^*) = w(c)^2$.

If n ≡ −1 (mod 4), then −1 ∈ N and so the operation $x \mapsto x^{-1}$ transforms $g_0$ into
$g_1$; thus $c(x)c(x^{-1})$ is divisible by $g_0g_1$. Now corresponding terms in c(x) and $c(x^{-1})$
give rise to a term of degree zero in $c(x)c(x^{-1})$, so there are at most $w(c)^2 - w(c) + 1$
terms in all and (10.5.7) follows.

Finally assume that n ≡ −1 (mod 8) and q = 2; then (10.5.7) applies. Further,
any code word c has degree < n; hence on writing r = deg c, we have
$x^rc(x^{-1})c(x) = fg_0g_1$, where f is a polynomial of degree < n. By Lemma 10.5.1,
this product has weight exactly n, and writing d = w(c), we have $d^2 - d + 1 \ge n$.
Now consider how terms in $c(x)c(x^{-1})$ can cancel. We have $c = \sum x^{r_i}$,
$c(x^{-1}) = \sum x^{-r_i}$ and a pair of terms in the product will cancel if $r_i - r_j = r_k - r_l$.
But in this case $r_j - r_i = r_l - r_k$ and another pair will cancel, so that terms cancel
in fours. Hence we have $d^2 - d + 1 - 4t = n$; therefore $d^2 - d \equiv 2$ (mod 4), and
so d ≡ −1 (mod 4). ∎



Let us now take q = 2. Then the condition that q is a quadratic residue mod n
gives n ≡ ±1 (mod 8). Consider the polynomial

$$\theta(x) = \sum_{r \in Q} x^r.$$

Since $\theta(x)^2 = \sum x^{2r} = \theta(x)$, it follows that θ is idempotent. In particular, for the
primitive element α of E we have $\theta(\alpha)^2 = \theta(\alpha)$, so θ(α) is 0 or 1. For any r ∈ Q
we have $\theta(\alpha^r) = \theta(\alpha)$, while for r ∈ N, $\theta(\alpha^r) + \theta(\alpha) = \sum_{i=1}^{n-1} \alpha^i = 1$. If θ(α) = 1,
replace α by $\alpha^s$, where s ∈ N; since (s, n) = 1, $\alpha^s$ is again a primitive n-th root of 1
and $\theta(\alpha^s) = 0$. Thus for a suitable choice of α we have θ(α) = 0. It follows that

$$\theta(\alpha^i) = \begin{cases} 0 & \text{if } i \in Q, \\ 1 & \text{if } i \in N, \\ (n-1)/2 & \text{if } i = 0. \end{cases}$$

If n ≡ 1 (mod 8), then $\theta(\alpha^i)$ vanishes exactly when i ∈ Q ∪ {0}, so the code is then
generated by $(x - 1)g_0$. Similarly if n ≡ −1 (mod 8), then $\theta(\alpha^i)$ vanishes when
i ∈ Q, so in this case the generator is $g_0$.

Let C(n) be the binary code defined in this way and $\hat{C}(n)$ its extension by parity
check. It can be shown that the group of $\hat{C}(n)$ is transitive on the n + 1 places.
(These places may be interpreted as the points on the projective line over $F_n$, and
the group is then the projective special linear group, see van Lint (1982), p. 88.)
Consider a word c ∈ C of least weight d. Since the group is transitive, we may
assume that the last coordinate in $\hat{C}$ (the parity check) is 1. This means that c
has odd weight d, say, and so c(1) = 1. Hence by Proposition 10.5.2, d ≡ −1
(mod 4) and $d^2 - d + 1 \ge n$.

For example, for n = 7 we obtain the [7, 4]-Hamming code; here d = 3. A second
(and important) example is the case n = 23. Here

$$x^{23} - 1 = (x - 1)(x^{11} + x^9 + x^7 + x^6 + x^5 + x + 1)(x^{11} + x^{10} + x^6 + x^5 + x^4 + x^2 + 1).$$

Since 23 ≡ −1 (mod 8), $g_0$ is a generator for this code, which is known as the
[23, 12]-Golay code. Since $d^2 - d + 1 \ge 23$ and d ≡ −1 (mod 4), it follows that
d ≥ 7, so the 3-spheres about the code words form a packing. On checking their
size, we note the remarkable fact that

$$V_2(23, 3) = 1 + \binom{23}{1} + \binom{23}{2} + \binom{23}{3} = 2^{11}.$$

This shows that C(23) is a perfect code, with minimum distance d = 7. The extended
code $\hat{C}$ is of length 24, with minimum distance 8, giving rise to the Leech lattice,
a particularly close sphere packing in 24 dimensions. The symmetry group of a
point in this lattice in $\mathbf{R}^{24}$ is the first Conway group .0 ('dotto') of order
$2^{22} \cdot 3^9 \cdot 5^4 \cdot 7^2 \cdot 11 \cdot 13 \cdot 23 \approx 8.3 \times 10^{18}$. The quotient by its centre (which has order 2)
is the sporadic simple group known as .1, discovered in 1968 by John Conway
(see Conway and Sloane (1988)).
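
Both the factorization of $x^{23} - 1$ over $F_2$ and the sphere-packing count are easily confirmed by machine, and the code is small enough ($2^{12}$ words) for its minimum weight to be found by enumeration. The following Python sketch does this; which of the two factors of degree 11 is called `g0` depends on the choice of α and is an assumption here, as are the helper names.

```python
from math import comb

def pmul(a, b):
    """Product in F_2[x] of polynomials stored as bit masks."""
    out = 0
    while b:
        if b & 1:
            out ^= a
        a <<= 1
        b >>= 1
    return out

def poly(*exponents):
    """Bit mask of the polynomial with the given exponents."""
    return sum(1 << e for e in exponents)

g0 = poly(11, 9, 7, 6, 5, 1, 0)
g1 = poly(11, 10, 6, 5, 4, 2, 0)
full = pmul(pmul(poly(1, 0), g0), g1)          # (x + 1) g0 g1 over F_2
sphere = sum(comb(23, i) for i in range(4))    # V_2(23, 3)

# Code words are a(x) g0(x) with deg a < 12; no reduction mod x^23 - 1 is needed.
d = min(bin(pmul(a, g0)).count("1") for a in range(1, 1 << 12))
print(full == poly(23, 0), sphere, d)
```

Since $2^{12} \cdot 2^{11} = 2^{23}$, the spheres of radius 3 about the code words fill the whole space, which is the meaning of "perfect".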




Exercises 


1. Construct the ternary [11, 6]-Golay code and verify that it is perfect. Find its
weight enumerator.

2. Construct the extended ternary [12, 6]-Golay code and find its weight enumerator.
Is it self-dual?

3. A binary self-dual code is called doubly even if all weights of code words are
divisible by 4. Show that the extended [8, 4]-Hamming code is doubly even.
Show that if there is a [2k, k]-code which is doubly even, then k ≡ 0 (mod 4).

4. Show that the extended binary [24, 12]-Golay code is doubly even. Find its weight
enumerator.


Further exercises on Chapter 10 


1. (The Plotkin bound) Prove that for any q-ary block code, if d > θn, where
$\theta = 1 - q^{-1}$, then $A_q(n, d) \le d/(d - \theta n)$. (Hint. Write the words of a maximal
(n, M, d)-code as an M × n matrix and compute the sums of the distances
between distinct words in two ways, along rows and along columns.)

2. Deduce the Plotkin bound of Proposition 10.3.3 from the general Plotkin bound
of Exercise 1.

3. Show that the binary [n, k]-Hamming code has a parity check matrix whose
columns are the numbers 1 to $2^{n-k} - 1$ in binary notation.

4. Show that the weight enumerator A(z) of the Hamming code satisfies the
differential equation $(1 - z^2)A' + (1 + nz)A = (1 + z)^n$.

5. Examine the linear q-ary codes with the property that for any code word
$c = (c_0, c_1, \ldots, c_{n-1})$, the vector $(\lambda c_{n-1}, c_0, \ldots, c_{n-2})$ is again a code word, where
λ is a fixed element of $F_q$. In particular consider the case $\lambda^n = 1$.

6. Let q be 2 or 3. Show that for a self-dual code the homogeneous weight enumerator
is invariant under the transformations

$$(x, y) \mapsto \big([x + (q-1)y]/\sqrt{q},\ (x - y)/\sqrt{q}\big), \qquad (x, y) \mapsto (x, \omega y),$$

where $\omega^q = 1$. Show that for q = 2 the group generated by these transformations
has order 16. What is the order for general q?

7. Show that the weight enumerator of any binary self-dual code is a combination
of $f = z^2 + 1$ and $g = z^8 + 14z^4 + 1$ (the Gleason polynomials). (Hint. Apply
Molien's theorem, see Exercise 8 of Section 6.4, to Exercise 6.)

8. In any ISBN book number $a_1 \ldots a_{10}$ the final digit is chosen so that $\sum ka_k \equiv 0$
(mod 11) (using the digits 0, 1, ..., 9, X). Show that this allows single errors and
transpositions to be detected, but not necessarily a total reversal (writing the
number back to front).


Languages and automata 


Many problems in mathematics consist in the calculation of a number or function, 
and our task may be to classify the different types of calculation that can arise. This 
can be done very effectively by describing simple machines which could carry out 
these calculations. Of course the discussion is entirely theoretical (we are not con- 
cerned with building the machines), but it is no accident that this way of thinking 
became current in the age of computers. Alan Turing, one of the pioneers of digital 
computers, used just this method in 1936 to attack decision problems in logic, by 
introducing the class of ‘computable functions’, i.e. functions that could be 
computed on a Turing machine. This development has had many consequences, 
most of them outside our scope. However, the simplest machines, automata, have 
an immediate algebraic interpretation. In the algebraic study of languages one uses 
simple sets of rules (‘grammars’) to derive certain types of languages, not mirroring 
all the complexities of natural languages, but more akin to programming languages. 
It turns out that these languages can also be described in terms of the machines 
needed to generate them, and in this chapter we give a brief introduction to algebraic 
languages and automata. 

The natural mathematical concept to describe these formal languages is the free 
monoid, and in Section 11.1 we discuss monoids and their actions. Languages 
form the subject of Section 11.2, while Section 11.3 introduces automata. The 
monoid ring of a free monoid is a free associative algebra, an object of independent 
interest, which in turn can be used to study languages, and Section 11.5 provides a 
brief account of free algebras and their completions (free power series rings). This is 
also the natural place to study variable-length codes (Section 11.4), which in their 
turn have influenced the development of free monoids and free algebras. 


11.1 Monoids and monoid actions 


We recall from BA, Section 2.1, that a monoid is a set M with a binary operation
$(x, y) \mapsto xy$ and a distinguished element 1, the neutral element or also unit element,
such that

M.1 x(yz) = (xy)z for all x, y, z ∈ M (associative law),
M.2 x1 = 1x = x.

Groups form the particular case where every element has an inverse. As an example
of a monoid other than a group we may take, for any set A, the set Map(A) = $A^A$ of
all mappings of A into itself, with composition of mappings as multiplication and the
identity mapping as neutral element. Many of the concepts defined for groups have a natural
analogue for monoids, e.g. a submonoid of a monoid M is a subset of M containing 1
and admitting multiplication. A homomorphism between monoids M, N is a mapping
f : M → N such that (xy)f = xf·yf, $1_Mf = 1_N$ for x, y ∈ M. Here we had to assume
explicitly that the unit element is preserved by f; for groups this followed from the
other conditions. A generating set of a monoid M is a subset X such that every
element of M can be written as a product of a number of elements of X. For example,
the set N of all natural numbers is a monoid under multiplication, with neutral
element the number 1; here a generating set is given by the set of all prime numbers,
for every positive integer can be written as a product of prime numbers, with 1
expressed as the empty product. Likewise the set $N_0 = N \cup \{0\}$ is a monoid under
addition, with neutral element 0 and generating set {1}.

An example of particular importance for us in the sequel is the following monoid.
Let X be any set, called the alphabet, and denote by X* the set of all finite sequences
of elements of X:

$$w = x_1x_2 \ldots x_r, \quad x_i \in X,\ r \ge 0. \qquad (11.1.1)$$

Here we include the empty sequence, written as 1. We define multiplication in X* by
juxtaposition:

$$(x_1 \ldots x_r)(y_1 \ldots y_s) = x_1 \ldots x_ry_1 \ldots y_s. \qquad (11.1.2)$$

The associative law is easily verified, and it is also seen that the empty sequence 1 is
the neutral element. X* is called the free monoid on X. We remark that when
X = ∅, X* reduces to the trivial monoid consisting of 1 alone. This case will usually
be excluded in what follows. Apart from this trivial case the simplest free monoid is
that on a one-element set, {x} say. The elements are $1, x, x^2, x^3, \ldots$, with the usual
multiplication. We see that {x}* is isomorphic to $N_0$, the monoid of non-negative
integers under addition, by the rule $n \leftrightarrow x^n$. Since the expression (11.1.1) for an
element w of a free monoid is unique, the number r of factors on the right is an
invariant of w, called its length and written |w|.


The name ‘free monoid’ is justified by the following result: 


Theorem 11.1.1. Every monoid is a homomorphic image of a free monoid. 


Proof. Let M be any monoid and A a generating set. Take a set A′ in bijective
correspondence with A and write F for the free monoid on A′. We have a mapping
$f : F \to M$ defined by

$$(a'_1 \cdots a'_r)f = a_1 \cdots a_r, \tag{11.1.3}$$

where $a' \leftrightarrow a$ is the given correspondence between A′ and A. Since every element of
F can be written as a product $a'_1 \cdots a'_r$ in just one way, f is well-defined by (11.1.3).
It is surjective because A generates M, and f is easily seen to be a homomorphism
by (11.1.2). ∎
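
The map f of (11.1.3) can be tried out concretely. The following is a minimal sketch, assuming words of the free monoid are stored as Python tuples of letters; the function name `evaluate`, the letters "a", "b", "c" and the choice of M (positive integers under multiplication, generated here by 2, 3, 5) are illustrative assumptions, not part of the text.

```python
from functools import reduce
from typing import Callable, Dict, Sequence, TypeVar

T = TypeVar("T")

def evaluate(word: Sequence[str], image: Dict[str, T],
             mul: Callable[[T, T], T], one: T) -> T:
    """Image of a word of the free monoid under the map f of (11.1.3):
    replace each letter a' by its partner a and multiply in M."""
    return reduce(mul, (image[letter] for letter in word), one)

# Hypothetical example: letters "a", "b", "c" correspond to the generators 2, 3, 5.
image = {"a": 2, "b": 3, "c": 5}

def times(m: int, n: int) -> int:
    return m * n

u, v = ("a", "a", "b"), ("c", "a")
fu = evaluate(u, image, times, 1)        # 2 * 2 * 3
fv = evaluate(v, image, times, 1)        # 5 * 2
fuv = evaluate(u + v, image, times, 1)   # juxtaposition in the free monoid
```

Here `u + v` is the product (11.1.2) in the free monoid, so comparing `fuv` with `fu * fv` checks the homomorphism property on one pair of words; the empty tuple is sent to the neutral element.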




Just as groups can be represented by permutations, so can monoids be realized by
means of mappings. If M is any monoid, then by an M-set or a set with an M-action
we understand a set S, with a mapping from S × M to S, written $(s, x) \mapsto sx$, such
that


S.1 $s(xy) = (sx)y$ for all $s \in S$, $x, y \in M$,
S.2 $s1 = s$.


Writing for the moment $R_x$ for the mapping $s \mapsto sx$ of S into itself, we can express
S.1, S.2 as


$$R_{xy} = R_x R_y, \qquad R_1 = 1. \tag{11.1.4}$$


This just amounts to saying that the mapping $R : x \mapsto R_x$ is a monoid homomorphism
of M into Map(S). For example, M itself is an M-set, taking the multiplication
in M as M-action. This is sometimes called the regular representation
of M. We can use it to obtain the following analogue of Cayley's theorem for
groups (BA, Theorem 2.2.1):


Theorem 11.1.2. Every monoid can be faithfully represented as a monoid of mappings. 


Proof. Given a monoid M, we take the regular representation of M. If this is $x \mapsto \rho_x$,
then ρ is a homomorphism from M to Map(M), by what has been said, and if
$\rho_x = \rho_y$, then $x = 1.\rho_x = 1.\rho_y = y$, hence the homomorphism is injective. ∎
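
For a finite monoid the regular representation can be tabulated and both claims of the proof checked by machine. The sketch below is only an illustration: the monoid (integers mod 4 under multiplication) and the helper names `regular_representation` and `compose` are assumptions made for the example.

```python
from itertools import product

def regular_representation(elements, mul):
    """Map each x to rho_x : s -> s*x, stored as the tuple of images of `elements`."""
    return {x: tuple(mul(s, x) for s in elements) for x in elements}

def compose(p, q, elements):
    """The mapping 'p first, then q' (mappings are written on the right)."""
    index = {s: k for k, s in enumerate(elements)}
    return tuple(q[index[p[k]]] for k in range(len(elements)))

# Hypothetical example: integers mod 4 under multiplication (a monoid, not a group).
elements = (0, 1, 2, 3)

def mul(a, b):
    return (a * b) % 4

rho = regular_representation(elements, mul)

# Injectivity: distinct elements give distinct mappings (compare their values at 1).
is_injective = len(set(rho.values())) == len(elements)

# Homomorphism: rho_{xy} = rho_x rho_y, which is (11.1.4) for the regular action.
is_homomorphism = all(
    rho[mul(x, y)] == compose(rho[x], rho[y], elements)
    for x, y in product(elements, repeat=2)
)
```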


Let us return to a general monoid M and an M-set S. By Theorem 11.1.1 we can 
write M as a homomorphic image of a free monoid X*, for some set X. Thus we have 
a homomorphism 


$$X^* \to M \to \mathrm{Map}(S);$$


this shows that any set S with an M-action can also be regarded as a set with an
X*-action, where X corresponds to a generating set of M.

A free monoid has several remarkable properties, which can also be used to
characterize it. A monoid M is called conical if $xy = 1 \Rightarrow x = y = 1$; M is said to
have cancellation or be a cancellation monoid if for all $x, y \in M$, xu = yu or ux = uy
for some $u \in M$ implies x = y. Further, M is rigid if it has cancellation and whenever
ac = bd, there exists $z \in M$ such that either a = bz or b = az. We observe that any
free monoid is conical and rigid; the first property is clear from (11.1.2), because
the product in (11.1.2) cannot be 1 unless r = s = 0. To prove cancellation we
note that in any element $x_1 \cdots x_r \ne 1$ the leftmost factor $x_1$ is unique, as is the
rightmost factor $x_r$. Thus $x_1 \cdots x_r = y_1 \cdots y_s$ can hold only if r = s and $x_i = y_i$
$(i = 1, \ldots, r)$. It follows that when xu = yu, say

$$x_1 \cdots x_r u_1 \cdots u_h = y_1 \cdots y_s u_1 \cdots u_h,$$

then both sides have the same length and $x_i = y_i$ $(i = 1, \ldots, r = s)$, therefore
$x = x_1 \cdots x_r = y_1 \cdots y_s = y$; a similar argument applies when ux = uy. To prove
rigidity, let ac = bd, say $a = x_1 \cdots x_r$, $b = y_1 \cdots y_s$, $c = u_1 \cdots u_h$, $d = v_1 \cdots v_k$. Then
we have

$$x_1 \cdots x_r u_1 \cdots u_h = y_1 \cdots y_s v_1 \cdots v_k.$$

By symmetry we may assume that $r \le s$; then $x_1 = y_1, \ldots, x_r = y_r$, and hence
$b = x_1 \cdots x_r y_{r+1} \cdots y_s = az$, where $z = y_{r+1} \cdots y_s$. This shows a free monoid to be
rigid. We remark that when ac = bd, then a = bz or b = az according as |a| is ≥
or ≤ |b|.
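
The rigidity argument is effectively an algorithm: compare lengths and read off z. A minimal sketch, assuming the elements of the free monoid are represented as Python strings (the function name `rigid_factor` is hypothetical):

```python
def rigid_factor(a: str, b: str, c: str, d: str):
    """Given ac == bd in a free monoid (words as Python strings), return
    ("b = az", z) or ("a = bz", z), following the length comparison in the text."""
    if a + c != b + d:
        raise ValueError("expected ac == bd")
    if len(a) <= len(b):
        return ("b = az", b[len(a):])   # a is a left factor of b
    return ("a = bz", a[len(b):])       # b is a left factor of a

case1 = rigid_factor("xy", "xyx", "xyy", "yy")    # ac = bd = "xyxyy"
case2 = rigid_factor("xyx", "xy", "yy", "xyy")    # the same equation with the roles swapped
```

When |a| = |b| the function returns z equal to the empty word, i.e. a = b, in agreement with cancellation.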

By a unit in a monoid M we understand an element u such that v exists in M satisfying
uv = 1, vu = 1. For example, in a conical monoid the only unit is 1. When M
has cancellation, it is enough to assume one of these equations, say uv = 1; for then
(vu)v = v(uv) = v1 = 1v, hence vu = 1 by cancellation, and similarly if vu = 1.
Let us define an atom as a non-unit which cannot be expressed as a product of
two non-units (as in rings). For example, in a free monoid the atoms are just the
elements of length 1. This shows incidentally that in a free monoid the free generating
set is uniquely determined as the set of all atoms. We now have the following
characterization of free monoids:


Theorem 11.1.3. Let F be a monoid and X the set of all its atoms. Then F is free on X
as free generating set if and only if F is conical and rigid, and is generated by X.


Proof. We have seen that in a free monoid these conditions are satisfied. Conversely,
assume that they hold; we shall show that every element of F can be written in just
one way as a product of elements of X. Any $a \in F$ can be expressed as such a product
in at least one way, because X generates F. If we have

$$x_1 \cdots x_r = y_1 \cdots y_s, \qquad x_i, y_j \in X,$$

then by rigidity, $x_1 = y_1 b$ or $y_1 = x_1 b$ for some $b \in F$, say the former holds. Since
$x_1, y_1$ are atoms, b must be a unit and so b = 1 because F is conical. Thus $x_1 = y_1$
and we can cancel this factor and obtain $x_2 \cdots x_r = y_2 \cdots y_s$. By induction on
max(r, s) we find r − 1 = s − 1, i.e. r = s and $x_2 = y_2, \ldots, x_r = y_r$. Thus F is
indeed free on X, as we had to show. ∎
Exercises 


1. Show that every finite cancellation monoid is a group.

2. Let a, b be any elements of a monoid M. Show that if ab and ba are invertible in M,
then so are a and b, but this does not follow if we only know that ab is invertible.
What can we say if aba is invertible?

3. Show that every finitely generated monoid which is conical and rigid is free.

4. Show that the additive monoid of non-negative rational numbers is conical and
rigid but not free.

5. Show that a submonoid of a free monoid is free iff it is rigid. Give examples
of submonoids of free monoids that are not free. (Hint. Consider first the
1-generator case.)




6. A set with an associative multiplication is called a semigroup. Show that any semigroup
S may be embedded in a monoid by defining $S^1 = S \cup \{1\}$ with multiplication
x1 = 1x = x for all $x \in S$.

7. A zero in a monoid M is an element 0 such that 0x = x0 = 0 for all $x \in M$. Verify
that (i) a monoid has at most one zero, (ii) every monoid M can be embedded in
a monoid $M_0$ with zero. If M already has a zero, how can the presence of these two
zeros be reconciled with (i)?


11.2 Languages and grammars 


Algebraic language theory arose from the attempt by Noam Chomsky to analyse and 
make precise the process of forming sentences in natural languages. The first point to 
notice is that whereas one may often test a sentence of a natural language like English 
for its grammatical correctness by checking its meaning, this is really irrelevant. This 
is well illustrated by Chomsky's example of a meaningless sentence which is grammatically
correct:


Colourless green ideas dream furiously. 
To emphasize the point, he confronts it with a sentence which is not correct: 
Furiously ideas green colourless dream. 


In principle the analysis ('parsing') of a sentence consists in determining its constituents
and checking that they have been put together correctly according to prescribed
rules:


  The       cow     jumped     over          the       moon
   |         |        |          |            |          |
article    noun     verb    preposition    article     noun
     subject                              object
                          sentence


The mathematical model consists of a set of rules of the form: sentence → {subject,
verb}, noun → cow, etc., which will lead to all the sentences of the language and no
others. This amounts to reading the above diagram from the bottom upwards.

In order to write our sentences we need a finite (non-empty) set X, our alphabet. 
As we have seen, the free monoid on X is the set X* of all strings of letters from X, 
also called words in X (with a multiplication which we ignore for the moment). By a 
language on X we understand any subset of X*. Here we do not distinguish between 
words and sentences; we can think of an element of X* as a message, with a 
particular symbol of X as a blank space, to separate the words of the message. 




We single out a particular language by prescribing a set of rules according to which
its sentences are to be formed. These rules constitute the grammar of the language
and are formally defined as follows: A phrase structure grammar or simply grammar
G consists of three sets of data:


(i) An alphabet X; its letters are also called terminal letters.

(ii) A set V of clause-indicators or variables, including a symbol σ for a complete
sentence. We write $A = X \cup V$.

(iii) A finite set of rewriting rules: $u \to v$, where $u, v \in A^*$ and u contains at least one
variable, i.e. $u \notin X^*$.


A string of letters from A, i.e. a member of A*, is called terminal if it lies in X*,
non-terminal otherwise. The rewriting rule $u \to v$ is applied by replacing a string fug
over A by fvg. To obtain a sentence in our language we start from σ and apply the
rewriting rules until no variables (clause-indicators) are left. If the resulting terminal
string is f, we write σ →→ f and call the sequence of rules applied a derivation of f,
while f itself is a sentence of the language. In this way we obtain the language L(G)
generated by the given grammar; it consists of all the strings on X that are sentences
of the language. A language is called proper if it does not include the empty word 1.

In giving examples we shall use Latin letters for the terminal letters and Greek letters
for the variables. With this convention it is not necessary to mention the alphabets X,
V separately.


Examples 


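1. $L = \{x^n \mid n \ge 1\}$. Rules: $\sigma \to x$, $\sigma \to \sigma x$. We shall write this more briefly as $\sigma \to x;\ \sigma x$.
A typical derivation is $\sigma \to \sigma x \to \sigma x^2 \to \sigma x^3 \to x^4$. Similarly the language
$\{x^{2m+3n} \mid m, n \ge 0\}$ is generated by the rules $\sigma \to \sigma x^2;\ \sigma x^3;\ 1$.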


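2. $L = \{xy^n \mid n \ge 0\}$, also written xy*. Rules: $\sigma \to x;\ \sigma y$.

3. $L = \{x^m y^n \mid m, n \ge 0\}$. Rules: $\sigma \to x\sigma;\ \sigma y;\ 1$.

4. $L = \{x^m y^n \mid 0 \le m \le n\}$. Rules: $\sigma \to x\sigma y;\ \sigma y;\ 1$.

5. $L = \{x^n z y^n \mid n \ge 0\}$. Rules: $\sigma \to x\sigma y;\ z$.

6. The empty language $L = \emptyset$ has the rule $\sigma \to \sigma$.

7. The universal language X* has the rules $\sigma \to \sigma x;\ 1$ $(x \in X)$.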
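
Small grammars of this kind can be explored by brute force. The following is a rough sketch, not an efficient parser: it applies the rewriting rules breadth-first and collects the terminal words up to a length bound. The letter "S" stands for σ, the empty word 1 is the empty string, and the pruning bound `max_len + 1` is an assumption that suffices for grammars whose only length-decreasing rule erases a single variable.

```python
from collections import deque

def generate(rules, start="S", variables=frozenset("S"), max_len=6):
    """Return all terminal words of length <= max_len derivable from `start`.

    `rules` is a list of pairs (u, v) meaning u -> v, applied by replacing
    a string f u g by f v g.  Strings longer than max_len + 1 are discarded.
    """
    seen, queue, words = {start}, deque([start]), set()
    while queue:
        form = queue.popleft()
        if not any(ch in variables for ch in form):
            if len(form) <= max_len:
                words.add(form)          # terminal string: a sentence of L(G)
            continue
        for u, v in rules:
            pos = form.find(u)
            while pos != -1:             # rewrite every occurrence of u in turn
                new = form[:pos] + v + form[pos + len(u):]
                if len(new) <= max_len + 1 and new not in seen:
                    seen.add(new)
                    queue.append(new)
                pos = form.find(u, pos + 1)
    return words

example4 = [("S", "xSy"), ("S", "Sy"), ("S", "")]   # {x^m y^n | 0 <= m <= n}
example5 = [("S", "xSy"), ("S", "z")]               # {x^n z y^n | n >= 0}

words4 = generate(example4, max_len=3)
words5 = generate(example5, max_len=5)
```

For Example 5 the bound 5 yields the three words z, xzy, xxzyy; for Example 4 the bound 3 yields the six words with m ≤ n and m + n ≤ 3, including the empty word.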


This concept of a language is of course too wide to be of use and one singles out 
certain classes of languages by imposing conditions on the generating grammar, as 
follows. The classification below is known as the Chomsky hierarchy. 

0. By a language of type 0 or a phrase structure language we understand any 
language generated by a phrase structure grammar. By no means every language is 
of type 0; in fact, since the alphabet and the set of rewriting rules are finite, the 
set of all languages of type 0 is countable, whereas there are uncountably many 
languages, because an infinite set has uncountably many subsets. It can be shown 
that the languages of type 0 are precisely the recursively enumerable subsets of X*
(see e.g. M. Davis (1958)). 




1. A language is said to be of type 1, or context-sensitive, or a CS-language if it can
be generated by a grammar in which all the rewriting rules are of the form


$$f\alpha g \to fug, \qquad \text{where } \alpha \in V,\ u \in A^+,\ f, g \in A^* \quad (A^+ = A^* \setminus \{1\}). \tag{11.2.1}$$


The grammar is then also called a CS-grammar. The rule (11.2.1) can be taken to
mean: α is replaced by u in the context fαg.

2. A language is said to be of type 2, or context-free, or a CF-language if it can be
generated by a grammar with rewriting rules of the form


$$\alpha \to u, \qquad \text{where } \alpha \in V,\ u \in A^*. \tag{11.2.2}$$


The grammar is then also called a CF-grammar. The rule (11.2.2) means that α is
replaced by u independently of the context in which it occurs.

3. A language is said to be of type 3, or regular, or finite-state if it can be generated
by a grammar with rules of the form


$$\alpha \to x\beta, \quad \alpha \to 1, \qquad \text{where } x \in X,\ \alpha, \beta \in V. \tag{11.2.3}$$


Again the term regular is also used for the grammar. Here α is replaced by a variable
following a letter or by 1. Instead of writing the variable β on the right of the
terminal letter we can also restrict the rules so as to have β on the left of the terminal
letter throughout. It can be shown that this leads to the same class of languages (see
Exercise 4 of Section 11.3).

If $\mathcal{L}_i$ (i = 0, 1, 2, 3) denotes the class of all proper languages of type i, then it is
clear that


$$\mathcal{L}_0 \supseteq \mathcal{L}_1 \supseteq \mathcal{L}_2 \supseteq \mathcal{L}_3; \tag{11.2.4}$$


in fact the inclusions can all be shown to be strict, but in general it may not be easy to
tell where a given language belongs, since there are usually many grammars generating
it. Thus to show that a given language is context-free we need only find a CF-grammar
generating it, but to show that a language is not context-free we must
show that none of the grammars generating it is CF.

We note the following alternative definition of grammars of type 1, 2: 


Proposition 11.2.1. Let G be a grammar with alphabets X, V. Then
(i) If G is a CS-grammar, then for every rule $u \to v$ in G,

$$|u| \le |v|. \tag{11.2.5}$$

(ii) If G is a CF-grammar, then for every rule $u \to v$ in G,

$$|u| = 1. \tag{11.2.6}$$


Conversely, if G satisfies (11.2.5), (11.2.6) resp., then there is a CS-grammar resp. a
CF-grammar generating L(G).


Proof. (i) Let G be a CS-grammar; any rule in G has the form $f\alpha g \to fug$, where
$u \ne 1$, hence $|u| \ge 1$ and so $|fug| \ge |f| + 1 + |g| = |f\alpha g|$, and (11.2.5) follows.
Conversely, when (11.2.5) holds for every rule, we can achieve the effect of $u \to v$
by replacing the letters in u one at a time by a letter in v, taking care to leave a (new)
variable until last. To give a typical example, if $u = u_1\alpha u_2u_3$, $v = v_1 \cdots v_5$, we replace
$u \to v$ by the rules $u_1\alpha u_2u_3 \to v_1\beta u_2u_3 \to v_1\beta u_2v_4v_5 \to v_1\beta v_3v_4v_5 \to v$, where β does
not occur elsewhere.

(ii) It is clear from the definition of a CF-language that its rules are characterized
by (11.2.6); the details may be left to the reader. ∎
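
The length conditions (11.2.5), (11.2.6) and the rule shape (11.2.3) are easy to test mechanically. The sketch below assumes every letter and variable is a single character and the empty word 1 is the empty string; the function name `rule_shapes` and the sample grammars (with "S", "T", "L", "M" standing for Greek variables) are illustrative only.

```python
def rule_shapes(rules, variables):
    """Report which of the conditions (11.2.5), (11.2.6), (11.2.3) every rule u -> v satisfies."""
    def is_var(ch):
        return ch in variables

    noncontracting = all(len(u) <= len(v) for u, v in rules)           # (11.2.5)
    context_free = all(len(u) == 1 and is_var(u) for u, v in rules)    # (11.2.6), u a variable
    regular = all(                                                     # (11.2.3)
        len(u) == 1 and is_var(u)
        and (v == "" or (len(v) == 2 and not is_var(v[0]) and is_var(v[1])))
        for u, v in rules
    )
    return {"noncontracting": noncontracting,
            "context_free": context_free,
            "regular": regular}

cf_rules = [("S", "xSy"), ("S", "z")]                 # rules for {x^n z y^n}
reg_rules = [("S", "xT"), ("T", "yT"), ("T", "")]     # rules for x y*
cs_rules = [("S", "xSLM"), ("S", "xzM"), ("ML", "LM"), ("zL", "zz"), ("M", "y")]

cf_report = rule_shapes(cf_rules, {"S"})
reg_report = rule_shapes(reg_rules, {"S", "T"})
cs_report = rule_shapes(cs_rules, {"S", "L", "M"})
```

Note that the regular grammar fails (11.2.5) only because of its rule T → 1; the check is purely syntactic and says nothing about whether some other grammar of a given shape generates the same language.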


Sometimes one may wish to include the empty word in a proper language. This is
most easily done by replacing any occurrence of σ on the right of a rule by a new
variable λ, say, adding for each rule with σ on the left the same rule with σ replaced
by λ on the left, and adding the rule $\sigma \to 1$. For example, to generate the language
$\{x^n z y^n, 1 \mid n \ge 0\}$ we modify Example 5 above: $\sigma \to x\lambda y;\ z;\ 1$, $\lambda \to x\lambda y;\ z$.
If we just added $\sigma \to 1$ to the rules of Example 5, we would also get xy.

From any improper CF-language L we can obtain the proper CF-language L\{1}
by replacing in any CF-grammar for L, any rule $\alpha \to 1$ by $\beta \to u'$, where u′ runs
over all words obtained from derivations of the form $\beta \to u$, where u contains
α, by replacing α in u by 1. For example, the language $\{x^m y^n \mid m + n > 0\}$ is generated
by $\sigma \to x\sigma;\ \sigma y;\ y;\ x$.

Looking at the examples given earlier, we see that Examples 1, 2 and 3 are regular, 
as well as Examples 6 and 7. Examples 4 and 5 are context-free but not regular, as we 
shall see in Section 11.3. We conclude with an example of a CS-language which is not 
context-free, as Proposition 11.3.6 will show. 


Example 


8. $\{x^n z^n y^n \mid n \ge 0\}$ has the generating grammar $\sigma \to x\sigma\lambda\mu;\ xz\mu;\ 1$, $\mu\lambda \to \lambda\mu$,
$z\lambda \to z^2$, $\mu \to y$. The first two rules generate all the words $x^n z\mu(\lambda\mu)^{n-1}$, the
fourth moves the λ's past the μ's next to z and the next replaces each λ by z.
Finally each μ is replaced by y. To obtain the same language without 1 we
simply omit the rule $\sigma \to 1$.
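
To see the rules of Example 8 at work, one derivation can be traced step by step. In this sketch "S", "L", "M" stand for σ, λ, μ, and each rule is applied to the leftmost occurrence of its left-hand side; the helper name `rewrite` and the particular order of steps are choices made for the illustration.

```python
def rewrite(form, u, v):
    """Apply the rule u -> v to the leftmost occurrence of u in `form`."""
    pos = form.find(u)
    if pos == -1:
        raise ValueError(f"{u!r} does not occur in {form!r}")
    return form[:pos] + v + form[pos + len(u):]

# One derivation of x^2 z^2 y^2.
steps = [("S", "xSLM"),   # sigma -> x sigma lambda mu
         ("S", "xzM"),    # sigma -> x z mu
         ("ML", "LM"),    # move lambda past mu, towards z
         ("zL", "zz"),    # replace lambda next to z by z
         ("M", "y"),      # replace each mu by y
         ("M", "y")]

form = "S"
derivation = [form]
for u, v in steps:
    form = rewrite(form, u, v)
    derivation.append(form)
```

The successive strings are S, xSLM, xxzMLM, xxzLMM, xxzzMM, xxzzyM, xxzzyy.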


Exercises 

1. Find a regular grammar to generate the set of all words of even length in X.

2. Show that each finite language is regular.

3. Show that if L, L′ are any languages of type i (i = 0, 1, 2 or 3), then so are $L \cup L'$,
$LL' = \{uv \mid u \in L, v \in L'\}$ and the language obtained from L by writing each word in
reverse order.

4. Show that if L is regular, then so is L*, the language whose words are all the finite
strings of words from L.

5. Show that regular languages form the smallest class containing all finite languages
and closed under union, product and *.

6. Show that every context-free language can be generated by a CF-grammar G
with the property: for each non-terminal variable α there is a derivation
$\alpha \to u$ $(u \in X^*)$ and for each terminal letter x there is a derivation $\alpha \to u$,
where x occurs in u.

7. Show that for any CF-grammar G = (X, V) there is a CF-grammar G′ producing
the same language as G such that (i) G′ contains no rule $\alpha \to \beta$, where $\alpha, \beta \in V$,
(ii) if L(G) is improper, then G′ contains the rule $\sigma \to 1$ but no other rules with
1 on the right-hand side and (iii) no rule of G′ has σ occurring on the right.
Thus all rules of G′ have the form $\sigma \to 1$, $\alpha \to x$ or $\alpha \to f$, where
$f \in (X \cup V \setminus \{\sigma\})^*$, $|f| \ge 2$.

8. Show that for a given CF-grammar G there exists a CF-grammar G′ producing the
same language as G, with rules $\alpha \to xf$, $f \in V^*$ and possibly $\sigma \to 1$ (Greibach
normal form).


11.3 Automata 


Logical machines form a convenient means of studying recursive functions. In 
particular, Turing machines lead precisely to recursively enumerable sets, and so 
correspond to grammars of type 0, as mentioned earlier. These machines are outside 
the scope of this book and will not be discussed further, but we would expect the 
more restricted types 1-3 of grammars to correspond to more special machines. 
This is in fact the case and in this section we shall define the types of machines
corresponding to these grammars and use them to derive some of their properties.
A sequential machine M is given by three sets and two functions describing its
action. There is a set S of states as well as two finite alphabets: an input X and
an output Y. The action is described by a transition function $\delta : S \times X \to S$ and an
output function $\lambda : S \times X \to Y$. To operate the machine we start from a given
state s and input x; then the machine passes to the state $\delta(s, x)$ and produces the
output $\lambda(s, x)$. In general the input will not just be a letter but a word w on X.
The machine reads w letter by letter and gives out $y \in Y$ according to the output
function λ, while passing through the different states in accordance with the transition
function δ. The output is thus a word on Y, of the same length as w, obtained as
follows. Define mappings $\delta' : S \times X^* \to S$, $\lambda' : S \times X^* \to Y^*$ by the equations


$$\delta'(s, 1) = s, \qquad \delta'(s, ux) = \delta(\delta'(s, u), x), \qquad s \in S,\ x \in X,\ u \in X^*. \tag{11.3.1}$$


$$\lambda'(s, 1) = 1, \qquad \lambda'(s, ux) = \lambda'(s, u)\,\lambda(\delta'(s, u), x). \tag{11.3.2}$$


These equations define δ′, λ′ by induction on the length of words. It is clear that
δ′, λ′ extend δ, λ respectively and so we may without risk of confusion omit the
primes from δ′, λ′. From (11.3.1) it is clear that


$$\delta(s, 1) = s, \qquad \delta(s, uv) = \delta(\delta(s, u), v), \qquad s \in S,\ u, v \in X^*, \tag{11.3.3}$$


so the mapping δ just defines an action of the free monoid X* on S. We note that this
holds even though no conditions were imposed on δ.
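
The inductive definitions (11.3.1), (11.3.2) amount to a simple loop over the input word. A minimal sketch, assuming δ and λ are given as Python dictionaries keyed by (state, letter); the machine itself (it tracks the parity of the number of a's read and outputs the new parity) is a made-up example.

```python
def run(delta, lam, s, w):
    """Extend delta, lambda to words as in (11.3.1), (11.3.2):
    return the final state and the output word for input w from state s."""
    out = []
    for x in w:
        out.append(lam[(s, x)])   # output for the current state and letter
        s = delta[(s, x)]         # then move to the next state
    return s, "".join(out)

# Hypothetical machine: states 0, 1; input letters "a", "b"; output letters "0", "1".
delta = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1}
lam = {key: str(target) for key, target in delta.items()}

state, output = run(delta, lam, 0, "abaa")

# The action property (11.3.3): reading uv is reading u, then v.
mid_state, _ = run(delta, lam, 0, "ab")
same_end = run(delta, lam, mid_state, "aa")[0] == state
```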

From this definition it is clear that a machine is completely specified by the set of
all quadruples of the form $(s, x, \lambda(s, x), \delta(s, x))$. Sometimes it is preferable to start
from a more general notion. Let S, X, Y be as before and define an automaton A as a
set of quadruples


$$P = P(A) \subseteq S \times X \times Y \times S. \tag{11.3.4}$$


The members of P are called its edges; each edge (s, x, y, s′) has an initial state s, input
x, output y and final state s′. For a sequential machine each pair $(s, x) \in S \times X$ determines
a unique edge (s, x, y, s′) and whenever our set P of edges is such that


C. for each pair $(s, x) \in S \times X$ there exists a unique $y \in Y$ and $s' \in S$ such that
$(s, x, y, s') \in P$,


then we can define λ, δ by writing $y = \lambda(s, x)$, $s' = \delta(s, x)$ and we have a sequential
machine. Two edges are consecutive if the final state of the first edge is also the initial
state of the second. By a path for A we understand a sequence $u = (u_1, \ldots, u_n)$ of
consecutive edges


$$u_i = (s_{i-1}, x_i, y_i, s_i), \qquad i = 1, \ldots, n.$$


Its length is n, $s_0$ is the initial and $s_n$ the final state, $x_1 \cdots x_n$ its input label and
$y_1 \cdots y_n$ its output label. It is clear how an automaton can be represented by a
graph with the set S of states as vertex set, each edge being labelled by its input
and output. Sometimes one singles out two subsets I, F of S; a path is called
successful if its initial state is in I and its final state in F. The set L(A) of all input
labels of successful paths is a subset of X*, called the behaviour of A, or also the
set accepted by A. We note that the output does not enter into the behaviour;
when the output is absent (so that P now consists of triples (s, x, s′)), A is called
an acceptor. As an example consider an acceptor with states $s_0, s_1, s_2$, input x, y and
transition function

        s_0    s_1    s_2
   x    s_1    s_2    s_2
   y    s_2    s_1    s_2

The graph is as shown. If $I = \{s_0\}$, $F = \{s_1\}$, then the behaviour is xy*; for $I = \{s_0\}$,
$F = \{s_2\}$ the behaviour is $xy^*xX^* \cup yX^*$.

An automaton is said to be complete if it satisfies condition C above (so that we
have a sequential machine), and the set I of initial states consists of a single state.
To operate a complete acceptor we take any word in X* and use it as input with
the machine in the initial state; this may or may not lead to a successful path, i.e. a path
ending in F. We shall be interested in its behaviour, i.e. the set of input labels
corresponding to successful paths. Thus the above example is a complete acceptor;
we note how the graph makes it very easy to compute its behaviour. In constructing
an acceptor it is usually convenient not to demand completeness, although complete
acceptors are easier to handle. Fortunately there is a reduction allowing us to pass
from one to the other:


Proposition 11.3.1. For each acceptor A there is a complete acceptor C with the same
behaviour. If A is finite (i.e. with a finite set of states), then so is C.


Proof. Let the set of states for A be S, with set of initial states I and set of final states F.
We take C to be on the same alphabet as A, with state set the set of all subsets of S,
initial state I and final set of states all sets meeting F. The transition function for C is
given by $\delta(U, x) = V$, where V consists of all states v such that (u, x, v) is an edge in A for
some $u \in U$. It is clear that C has the same behaviour as A, and it is easily seen to
be complete. ∎
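
The construction in this proof can be carried out directly. The sketch below departs from the proof in one respect, stated plainly: instead of taking all subsets of S as states it builds only the subsets reachable from the initial one, which does not change the behaviour. Edges are triples (s, x, t); the function names and the sample acceptor for xy* are assumptions made for the example.

```python
def subset_construction(edges, initial, final, alphabet):
    """Complete acceptor whose states are subsets of the states of A
    (only those reachable from the initial subset are constructed)."""
    start = frozenset(initial)
    final_set = frozenset(final)
    delta, todo, states = {}, [start], {start}
    while todo:
        U = todo.pop()
        for x in alphabet:
            # V = all states v with an edge (u, x, v) for some u in U
            V = frozenset(t for (s, a, t) in edges if s in U and a == x)
            delta[(U, x)] = V
            if V not in states:
                states.add(V)
                todo.append(V)
    finals = {U for U in states if U & final_set}   # subsets meeting F
    return states, start, finals, delta

def accepts(start, finals, delta, word):
    U = start
    for x in word:
        U = delta[(U, x)]
    return U in finals

# Hypothetical incomplete acceptor for x y*: two edges, no transition elsewhere.
edges = {("s0", "x", "s1"), ("s1", "y", "s1")}
states, start, finals, delta = subset_construction(edges, {"s0"}, {"s1"}, "xy")
```

The empty subset appears as a state and absorbs every missing transition, which is what makes the resulting acceptor complete.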


We remark that every subset Y of X* is the behaviour of some acceptor; we take
X* as state set, 1 as initial state and Y as final set of states, with right multiplication as
transition function. Our aim is to describe the behaviour of finite acceptors; we shall
find that this consists precisely of all regular languages. To prove this result we shall
need to construct a 'minimal' acceptor for a given language.

We shall use the notation (S, i, F) for a complete acceptor, where S is the set of
states, i is the initial state and F is the set of final states; the alphabet is usually
denoted by X and so will not be mentioned explicitly, and the transition function
is indicated by juxtaposition; thus instead of $\delta(s, x) = s'$ we write sx = s′. A state s
in an acceptor A = (S, i, F) is said to be accessible if there is a path from i to s,
coaccessible if there is a path from s to a state in F. As far as the behaviour of A
is concerned, we can clearly neglect any states that are not both accessible and
coaccessible. If every state of A is both accessible and coaccessible, then A is called trim.

Given two acceptors A, A′ with state sets S, S′, we define a state homomorphism
from A to A′ as a map $f : S \to S'$ such that $(s, x, t) \in A \Rightarrow (sf, x, tf) \in A'$. If f
has an inverse which is also a state homomorphism, f is called an isomorphism.
To give an example, let us put


$$L_s = \{v \in X^* \mid s.v \in F\};$$


then $L_s$ consists of the set of words which give a successful path with s as initial state.
Two states s, t are called separable if $L_s \ne L_t$, inseparable otherwise. If any two distinct
states are separable, the acceptor is said to be reduced. Every acceptor (finite or
infinite) has a homomorphic image which is reduced and has the same behaviour;
to obtain it we simply identify all pairs of inseparable states.

For every subset Y of X* we can define a reduced acceptor A(Y) whose behaviour
is Y. The states of A(Y) are the non-empty sets


$$u^{-1}Y = \{v \in X^* \mid uv \in Y\},$$


where u ranges over X*. The initial state is $1^{-1}Y = Y$ and the final states are the
states $u^{-1}Y$ containing 1. The transition function is defined by


$$Z.u = u^{-1}Z, \qquad \text{where } u \in X. \tag{11.3.5}$$


This is a partial function (i.e. not everywhere defined) since $u^{-1}Z$ may be empty, but
it is single-valued. If for u we take a word in X, we have, by induction on the length
of u,


$$w \in Z.ux = (Z.u).x \iff xw \in u^{-1}Z \iff uxw \in Z.$$

This shows that (11.3.5) holds for any $u \in X^*$. As a consequence we have

$$w \in L(A(Y)) \iff 1 \in Y.w \iff w \in Y,$$


which shows that the behaviour of A(Y) is indeed Y. We shall call A(Y) the minimal
acceptor for Y; its properties follow from


Theorem 11.3.2. Let A = (S, i, F) be a trim acceptor and put Y = L(A), the behaviour
of A. Then there is a state homomorphism $\varphi : A \to A(Y)$ to the minimal acceptor for Y
which is surjective on states, given by


$$\varphi : s \mapsto L_s = \{v \in X^* \mid s.v \in F\}. \tag{11.3.6}$$


Proof. Since A is trim, any state s in S is accessible, so i.u = s for some $u \in X^*$;
further, s is coaccessible, so $s.v \in F$ for some $v \in X^*$ and it follows that $L_s$ defined
by (11.3.6) is non-empty. Thus φ is well-defined. To show that it is a homomorphism
we have to verify that when s.x = t, then $L_s.x = L_t$. But we have


$$w \in L_s.x \iff xw \in L_s \iff s.xw \in F \iff w \in L_t;$$


so φ is indeed a homomorphism. It is surjective, because if $u^{-1}Y \ne \emptyset$, then there is
a successful path in A with label uv, where $v \in u^{-1}Y$. Now, writing s = i.u,


$$v \in u^{-1}Y \iff uv \in Y \iff s.v = i.uv \in F \iff v \in L_s;$$


thus $u^{-1}Y = L_s$ and this shows (11.3.6) to be surjective. ∎

As we have seen, for any subset Y of X* there is a reduced acceptor with behaviour
Y; taking this to be A in Theorem 11.3.2, we find φ in this case to be an isomorphism,
by the definition of 'reduced'. It follows that A(Y) must be reduced.


Corollary 11.3.3. The minimal acceptor for any subset of X* is reduced. ∎


Of course this is also not hard to verify directly. 
We can now establish 


Theorem 11.3.4. A language is regular if and only if it is the precise set accepted by a 
finite acceptor. 




Proof. Let A = (S, i, F) be a finite acceptor with behaviour Y and write the transition
function as δ for clarity. For our grammar G = {X, V, →} we take X to be the input
of A and V = S, the set of states, with σ = i, the initial state. For each state α in S and
$x \in X$ we include in G the rule


$$\alpha \to x\beta \quad \text{if } \delta(\alpha, x) = \beta, \tag{11.3.7}$$

and for each final state ω we include the rule $\omega \to 1$. Given any word $w = x_1 \cdots x_r$,
let us put $\delta(i, x_1) = s_1, \ldots, \delta(s_{r-1}, x_r) = s_r$. Then the rules (11.3.7) include
$i \to x_1s_1, \ldots, s_{r-1} \to x_rs_r$, hence $\sigma \to x_1s_1 \to x_1x_2s_2 \to \cdots \to x_1 \cdots x_rs_r$. If $s_r \in F$,
then $s_r \to 1$ and $x_1 \cdots x_r$ is included in L(G). On the other hand, if
$x_1 \cdots x_r \in L(G)$, consider the rules in G: they are all of the form $\alpha \to x\beta$ or
$\alpha \to 1$ and the number of variables is constant in the application of the former
rule and decreases by 1 when the latter is applied. Thus any derivation of $x_1 \cdots x_r$
must be of the form $i \to x_1s_1$, $s_1 \to x_2s_2, \ldots, s_{r-1} \to x_rs_r$, $s_r \to 1$. This means that
$\delta(s_{j-1}, x_j) = s_j$ $(j = 1, \ldots, r)$, where $s_0 = i$, and $s_r \in F$, so $x_1 \cdots x_r$ is accepted by A.


Conversely, let G be a regular grammar, with derived language L(G). For our
acceptor A we take the alphabet X of G as input and the set V of variables as state
set, with σ as initial state and a triple (α, x, β) for each rule $\alpha \to x\beta$, while the
final state set consists of all α such that $\alpha \to 1$. Then it is clear that the derivations
of G correspond precisely to the successful paths in A; the details may be left to the
reader. Hence L(G) is the set accepted by A. ∎
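
The first half of this proof translates into a few lines of code. In the sketch below a rule α → xβ is stored as the pair (α, (x, β)) and a rule ω → 1 as (ω, None); this encoding, the function names and the sample acceptor for xy* are assumptions made for illustration.

```python
def grammar_from_acceptor(delta, finals):
    """Rules alpha -> x beta whenever delta(alpha, x) = beta, as in (11.3.7),
    together with omega -> 1 for each final state omega."""
    rules = [(a, (x, b)) for (a, x), b in delta.items()]
    rules += [(w, None) for w in finals]
    return rules

def derives(rules, start, word):
    """Decide whether `word` has a derivation from `start`.  Every derivation has the
    shape start -> x1 s1 -> x1 x2 s2 -> ... followed by one rule s_r -> 1, so it is
    enough to track the set of variables that can stand at the right-hand end."""
    current = {start}
    for x in word:
        current = {rhs[1] for (a, rhs) in rules
                   if rhs is not None and a in current and rhs[0] == x}
    return any(rhs is None and a in current for (a, rhs) in rules)

# Hypothetical acceptor for x y*: initial state s0, final state s1.
delta = {("s0", "x"): "s1", ("s1", "y"): "s1"}
rules = grammar_from_acceptor(delta, {"s1"})
```

Reading the pairs (α, (x, β)) back as triples (α, x, β) and the pairs (ω, None) as final states gives the converse construction described in the second half of the proof.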


The acceptor constructed in this proof may not be complete, but it is trim 
provided that any superfluous variables have been removed from G. It follows by 
Theorem 11.3.2 that for a regular language the minimal acceptor is finite. This 
provides a practical way of determining whether a language is regular: Y is a regular 
language iff its minimal acceptor A(Y) is finite. We illustrate this result by some
examples. 


Examples 


1. $\{xy^n \mid n \ge 0\}$. The minimal acceptor has states $s_0 = xy^*$ and $s_1 = y^*$, and its operation
is given by the table:

        s_0    s_1
   x    s_1
   y           s_1

The initial state is $s_0$ and the final state is $s_1$. The behaviour can be read off from
the graph.

2. $\{x^n y \mid n \ge 0\}$. Here the states are $s_0 = x^*y$ and $s_1 = \{1\}$, with initial state $s_0$ and
final state $s_1$:

        s_0    s_1
   x    s_0
   y    s_1

3. $\{x^n y^n \mid n \ge 0\}$. We have the states $s_0 = \{x^n y^n \mid n \ge 0\}$, $s_1 = \{x^n y^{n+1} \mid n \ge 0\}$,
$s_2 = \{x^n y^{n+2} \mid n \ge 0\}, \ldots$, $t_0 = \{1\}$, $t_1 = \{y\}$, $t_2 = \{y^2\}, \ldots$. The initial state is $s_0$
and the final states are $s_0$, $t_0$, while the operations are given by the table below:

        s_0    s_1    s_2    ...    t_0    t_1    t_2    ...
   x    s_1    s_2    s_3    ...
   y           t_0    t_1    ...           t_0    t_1    ...


It should be clear from these examples how the behaviour of an acceptor may be read 
off from its graph. We note that in none of the cases is the acceptor complete; the 
transition function, though single-valued, is not everywhere defined. But this does 
not impair its usefulness; in any case we could replace it by a complete acceptor, 
using Proposition 11.3.1. Sometimes it is easier to test for regularity by means of 
the following necessary condition: 


Proposition 11.3.5. Let L be an infinite regular language on X. Then there exist
$w', y, w'' \in X^*$, $y \ne 1$, such that $w'y^nw'' \in L$ for n = 1, 2, ....


Proof. Since L is regular, there is an acceptor A for L, with a finite number, m say, of
states. The language L is infinite on a finite alphabet, so it contains a word w of
length > m. The acceptor reads w letter by letter, moving from state to state as it
reads. Since there are more than m steps, two of these states must be the same;
say $s_j$ occurs twice. If y is the portion read between the first time and the second
time in $s_j$, then w = w′yw″ and $y \ne 1$. Clearly our machine will also accept
$w'y^2w''$, and generally $w'y^nw''$; thus $w'y^nw'' \in L$. ∎
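
The factorization w = w′yw″ in this proof can be found by recording when each state is first visited. A minimal sketch, assuming a complete acceptor given by a dictionary δ; the three-state acceptor used here (accepting xy* with final state s1) and the function names are illustrative assumptions.

```python
def pumping_split(delta, start, word):
    """Return (w1, y, w2) with word == w1 + y + w2 and y non-empty, where y is the
    portion read between two visits to the same state; None if no state repeats."""
    seen = {start: 0}            # state -> number of letters read on first visit
    s = start
    for k, x in enumerate(word, 1):
        s = delta[(s, x)]
        if s in seen:
            j = seen[s]
            return word[:j], word[j:k], word[k:]
        seen[s] = k
    return None

def accepts(delta, start, finals, word):
    s = start
    for x in word:
        s = delta[(s, x)]
    return s in finals

# Hypothetical complete acceptor with behaviour x y* (s2 is a sink state).
delta = {("s0", "x"): "s1", ("s0", "y"): "s2",
         ("s1", "x"): "s2", ("s1", "y"): "s1",
         ("s2", "x"): "s2", ("s2", "y"): "s2"}

w1, y, w2 = pumping_split(delta, "s0", "xyy")
pumped_ok = all(accepts(delta, "s0", {"s1"}, w1 + y * n + w2) for n in range(1, 5))
```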

For context-free languages there is a condition similar to that in Proposition 
11.3.5; this is sometimes known as the pumping lemma: 


Proposition 11.3.6. Let G be a CF-grammar. Then there exist integers p, q such that
every word w of length greater than p in L(G) can be written as w = w′uzvw″, where
$uv \ne 1$, $|uzv| \le q$ and $w'u^nzv^nw'' \in L(G)$ for all $n \ge 1$.


Proof. The rules of G are all of the form $\alpha \to t$, $t \in A^*$. Suppose the number of variables
is k and choose p so large that every word $w \in L(G)$ of length > p has more
than k steps in its derivation. Each of these steps has the form $\alpha \to t$; hence some
variable occurs twice, so the part of the derivation from the first to the second occurrence
of α reads $\alpha \to \cdots \to u\alpha v$, where $uv \ne 1$. It follows that w = w′uzvw″, where
α occurs only once in the derivation $\alpha \to \cdots \to z$, which therefore has at most k
steps. Therefore uzv has bounded length, and by repeating the steps between α
and uαv we obtain $w'u^nzv^nw''$ for all $n \ge 1$. ∎


Proposition 11.3.5 shows that the language $\{x^nzy^n\}$, which we saw to be context-free,
is not regular, and Proposition 11.3.6 shows that $\{x^nz^ny^n\}$ is not context-free.

Let us return to the example $\{x^nzy^n\}$; we have just seen that it is not regular, and so
cannot be obtained from a finite acceptor. Intuitively we can see that a finite acceptor 
does not have the means of comparing the exponents of x and y. To make such a 
comparison requires a memory of some kind, and we shall now describe a machine 
with a memory capable of accepting CF-languages. The memory to be described is of 
a rather simple sort, a ‘first-in, last-out’ store, where we only have access to the last 
item in the store. 

A pushdown acceptor (PDA for short) is an acceptor which in addition to its set of
states S and input alphabet X has a set Σ of store symbols, with initial symbol $\lambda_0$, and a
transition function $\delta : S \times X \times \Sigma \to S \times \Sigma^*$; but for a given triple of arguments
there may be several or no values. At any stage the machine is described by a
triple $(s_i, w, \alpha)$, where $s_i \in S$, $w \in X^*$, $\alpha \in \Sigma^*$. We apply δ to the triple consisting
of $s_i$, the first letter of w and the last letter of α. If w = xw′, α = α′λ say, and
$(s_j, \beta)$ is a value of $\delta(s_i, x, \lambda)$, then


$$(s_i, xw', \alpha'\lambda) \to (s_j, w', \alpha'\beta)$$


is a possible move. Thus the effect of δ is to move into a state $s_j$, remove the initial
factor x from w and replace the final letter λ of α by β. We say that a word w on X is
accepted by the machine if, starting from $(s_0, w, \lambda_0)$ there is a series of moves to take
us to $(s_r, 1, \gamma)$, where $s_0$ is the initial and $s_r$ a final state. With this definition we have


Theorem 11.3.7. The context-free languages constitute the precise class of sets accepted 
by pushdown acceptors. ∎


We shall not give the proof here (see e.g. Arbib (1969)), but as an example we
describe a PDA for $\{x^ny^n \mid n \ge 1\}$. Its states are $s_0, s_1, s_2$, where $s_0$ is initial and $s_2$
final. The store symbols are $\lambda$ (initial symbol), $\mu$, $\nu$. We give the values for $\delta(s_i, \cdot, \cdot)$
in the form of a table for each $s_i$:


: 
SoH SOM V SoU” x 
Sop S| y 


Blanks and the remaining values (for $s_2$) remain undefined. To see how $x^ny^n$ is
accepted, but no other strings, we note how the store acts as a memory, remembering 
how many factors x have been taken off. If we think of the store as arranged verti- 
cally, at each stage we remove the topmost symbol and add a number of symbols at 
the top, rather like a stack of plates in a cafeteria; this explains the name. 
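The way such a machine works can be tried out on a small simulation. The following is only a sketch: the transition table `DELTA` is an assumption written for this illustration (it is not copied from the table in the text), and the names `DELTA` and `accepts` are hypothetical. The store is a string whose last letter is the accessible symbol, and all possible move sequences are searched, since the transition function may have several values.

```python
from collections import deque

# Assumed transition table, written for this sketch (not taken from the text).
# Key: (state, input letter, last store symbol);
# value: list of possible (new state, word replacing the last store symbol).
DELTA = {
    ("s0", "x", "λ"): [("s0", "μ")],    # first x: the initial symbol becomes the marker μ
    ("s0", "x", "μ"): [("s0", "μν")],   # each further x puts one ν on top
    ("s0", "x", "ν"): [("s0", "νν")],
    ("s0", "y", "ν"): [("s1", "")],     # each y removes one ν ...
    ("s1", "y", "ν"): [("s1", "")],
    ("s0", "y", "μ"): [("s2", "")],     # ... and the last y removes μ, reaching s2
    ("s1", "y", "μ"): [("s2", "")],
}


def accepts(word, delta=DELTA, initial="s0", finals=("s2",), start_symbol="λ"):
    """Search all move sequences starting from (initial, word, start_symbol)."""
    queue = deque([(initial, 0, start_symbol)])
    seen = set(queue)
    while queue:
        state, pos, store = queue.popleft()
        if pos == len(word):
            if state in finals:
                return True
            continue
        if not store:
            continue  # empty store: there is no last symbol, so no move is defined
        for new_state, beta in delta.get((state, word[pos], store[-1]), []):
            config = (new_state, pos + 1, store[:-1] + beta)
            if config not in seen:
                seen.add(config)
                queue.append(config)
    return False
```

With this table `accepts("xxyy")` holds, while words such as `"xxy"`, `"xyy"` or `"xyxy"` are rejected: the store records how many letters x have been read.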

By a somewhat more elaborate process, with a tape on which the input is written 
(a ‘linear-bounded’ automaton) one can devise a class of machines which accept 
precisely all the CS-languages (see Landweber [1963]). These machines are more 
special than Turing machines in that their tape length is bounded by a linear func- 
tion of the length of the input word. 

Finally we note the following connexion with monoids: 


Theorem 11.3.8. A language L on X is regular if and only if there is a homomorphism
$f : X^* \to M$ to a finite monoid M such that L is the complete inverse image of a subset
N of M: $L = f^{-1}(N)$.


Proof. Given a regular language L on X, we have a finite acceptor A which accepts
precisely L. Now A defines an action of X* on the set S of states of A. Thus
we have a homomorphism $f : X^* \to \mathrm{Map}(S)$. If P is the subset of Map(S) of all
mappings taking $s_0$ into the set F of final states, then $w \in L$ iff $wf \in P$; thus
$L = f^{-1}(P)$, and by definition Map(S) is a finite monoid. Thus the condition is
satisfied.

Conversely, given $f : X^* \to M$ with $L = f^{-1}(N)$ for some $N \subseteq M$, we consider
the acceptor A consisting of the alphabet X, state set M, action $\delta(a, x) = a \cdot (xf)$
($a \in M$, $x \in X$), with neutral element 1 as initial state and N as set of final states.
The language accepted by A is just L. ∎
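The first half of the proof can be followed on a small example. The sketch below assumes a hypothetical two-state acceptor on {x, y} for the words with an even number of x's (this acceptor and all names are chosen for illustration only); it forms the mappings of the state set induced by words, collects the monoid they generate, and compares acceptance with membership of wf in P for all short words.

```python
from itertools import product

# Hypothetical acceptor: state 0 = even number of x's read so far, state 1 = odd.
STATES = (0, 1)
ALPHABET = "xy"
STEP = {(0, "x"): 1, (1, "x"): 0, (0, "y"): 0, (1, "y"): 1}
INITIAL, FINALS = 0, {0}

IDENTITY = tuple(STATES)  # the identity mapping, as a tuple indexed by state


def letter_map(letter):
    """The mapping of STATES induced by a single letter."""
    return tuple(STEP[(s, letter)] for s in STATES)


def compose(g, h):
    """First apply g, then h (mappings written on the right, as in wf)."""
    return tuple(h[g[s]] for s in STATES)


def image(word):
    """The homomorphism f : X* -> Map(S)."""
    result = IDENTITY
    for letter in word:
        result = compose(result, letter_map(letter))
    return result


def transition_monoid():
    """Close {identity} under right multiplication by the letter mappings."""
    monoid, frontier = {IDENTITY}, [IDENTITY]
    while frontier:
        g = frontier.pop()
        for letter in ALPHABET:
            h = compose(g, letter_map(letter))
            if h not in monoid:
                monoid.add(h)
                frontier.append(h)
    return monoid


def accepted(word):
    """Run the acceptor directly."""
    state = INITIAL
    for letter in word:
        state = STEP[(state, letter)]
    return state in FINALS


M = transition_monoid()
P = {g for g in M if g[INITIAL] in FINALS}
words = ["".join(w) for n in range(6) for w in product(ALPHABET, repeat=n)]
consistent = all(accepted(w) == (image(w) in P) for w in words)
```

Here the image of f has just two elements (the identity and the transposition of the two states), and `consistent` records that w is accepted exactly when wf lies in P, for all words of length at most 5.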


Exercises 


1. Find the languages generated by the following grammars: (i) 0 —> o7: x: y, (ii) 
o> 0.x. ¥, (ili) o > 07; xay: yox: 1, (iv) o > xox: xyx; x. 

2. Find a CF-grammar on x, y generating the set of all words in which x is
immediately followed by y.

3. Find a grammar on x, y generating the set of all words in which each left factor
has at least as many x's as y's. Is this a CF-language?

4. A grammar with rules of the form $\alpha \to x$, $\alpha \to \beta x$ is sometimes called left
regular, while a grammar with rules of the form $\alpha \to x$, $\alpha \to x\beta$ is called right
regular. Show that every language generated by a left regular grammar can 
also be generated by a right regular grammar. (Hint. Interpret the words of 
the language as circuits in the acceptor graph; a left and a right regular grammar 
correspond to the two senses of traversing these loops.) 

5. Show that every CF-language in one letter is regular. 

6. Show that a language in a one-letter alphabet $\{x^n \mid n \in I\}$ is regular iff the set I of
exponents is ultimately periodic.




7. Construct a PDA for the set of all palindromes with 'centre marker', i.e. L(G),
where G: $\sigma \to x\sigma x;\ y\sigma y;\ z$. (Hint. Put one half of the word in store and then
match the other half.)

8. Find a PDA for the set of all even palindromes G: $\sigma \to x\sigma x;\ y\sigma y;\ 1$. (Hint.
Construct a PDA to 'guess' the centre.)

9. An automaton is called deterministic (resp. total) if for each $s \in S$, $x \in X$ there
exists at most (resp. at least) one pair $y \in Y$, $s' \in S$ such that $(s, x, y, s') \in P(A)$.
For any A define its reverse A° as the automaton with state set S, input Y,
output X and P(A°) as the set of all $(s', y, x, s)$ with $(s, x, y, s') \in P(A)$. Show that A° is
deterministic whenever A is reduced.

10. A complete automaton A with N states $s_1, \dots, s_N$ can be described by a set of
N × N matrices P(x|y) ($x \in X$, $y \in Y$) where the (i, j)-entry of P(x|y) is 1 if
$\lambda(s_i, x) = y$ and $\delta(s_i, x) = s_j$, and 0 otherwise. Define P(u|v) recursively for
$u \in X^*$, $v \in Y^*$ by P(ux|vy) = P(u|v)P(x|y), P(u|v) = 0 if $|u| \ne |v|$. Show that
P(uu'|vv') = P(u|v)P(u'|v'). Further put $P(x) = \sum_y P(x|y)$, write $\pi$ for the
row vector whose i-th component is 1 if $s_i$ is the initial state and 0 otherwise
and write f for the column vector with component 1 for a final and 0 for a
non-final state. Show that for any $u \in X^*$, $\pi P(u)f$ is 1 if u is accepted and 0
otherwise.

11. With the notation of Exercise 10, put $T = \sum_x P(x)$; verify that the (i, j)-entry of
$T^n$ is the number of words of length n in X which give a path from $s_i$ to $s_j$. Show
that if $\lambda(n)$ denotes the number of words of length n accepted, then the length
generating function $L(t) = \sum \lambda(n)t^n$ satisfies $L(t) = \pi(I - tT)^{-1}f$. Use the
characteristic equation of T to find a recursion formula for $\lambda(n)$.


11.4 Variable-length codes 


The codes studied in Chapter 10 were block codes, where each code word has the 
same length. However, in practice different letters occur with different frequencies 
and an efficient code will represent the more frequently occurring letters by the 
shorter code word, e.g. in Morse code, the letters e, t which are among the most 
commonly occurring are represented by a dot and a dash respectively. For this 
reason it is of interest to have codes in which the code words have varying lengths. 
In this section we shall describe such codes; the main problem is to design the code 
so as to ensure that messages can be uniquely decoded. Of course we can only take 
the first steps in the subject, but these will include results which are of interest and 
importance in general coding theory, besides leading to a better understanding of 
free monoids and free algebras. 

Let $X = \{x_1, \dots, x_r\}$ be our alphabet and X* the free monoid on X. Any subset A of
X* generates a submonoid, which we write as $\langle A \rangle$. By a code on X we understand a
subset A of X* such that every element of $\langle A \rangle$ can be uniquely factorized into
elements of A; in other words, $\langle A \rangle$ is the free monoid on A as free generating set.
Thus if $w \in \langle A \rangle$ and

$$w = a_1 \dots a_m = b_1 \dots b_n, \quad a_i, b_j \in A,$$

then m = n and $a_i = b_i$, i = 1, ..., n.

For example, X itself is a code; more generally, $X^n$, the set of all products of
length n (for any given $n \ge 1$) is a code. Further, any subset of a code is a code.
The set {x, xy, yx} is not a code, because the word xyx = xy.x = x.yx has two distinct 
factorizations. 
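For small examples the factorizations of a given word can simply be listed. The following sketch (the function name `factorizations` is hypothetical) enumerates all ways of writing a word as a product of elements of a finite set A; A fails to be a code as soon as some word has more than one factorization.

```python
from functools import lru_cache


def factorizations(word, A):
    """Return all ways of writing `word` as a product of elements of A."""
    A = tuple(a for a in A if a)  # the empty word is ignored

    @lru_cache(maxsize=None)
    def split(rest):
        if not rest:
            return ((),)  # the empty word has exactly one (empty) factorization
        found = []
        for a in A:
            if rest.startswith(a):
                found.extend((a,) + tail for tail in split(rest[len(a):]))
        return tuple(found)

    return list(split(word))
```

For instance `factorizations("xyx", ["x", "xy", "yx"])` yields the two factorizations x.yx and xy.x, whereas over the code {xx, xy, yx, yy} every word of even length has exactly one.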

Our first problem is how to recognize codes. If A is not a code, then we have an
equality between two distinct words in A, and by cancellation we may take this to be
of the form

$$au = bv, \quad \text{where } a, b \in A,\ u, v \in \langle A \rangle.$$

If we express everything in terms of X we find by the rigidity of X*,

$$\text{either } a = bz \text{ or } b = az, \text{ for some } z \in X^*.$$


If a = bz, we say that b is a prefix of a. Let us define a prefix set as a non-empty subset
of X* in which no element is a prefix of another. For example, {1} is a prefix set; any
prefix set $A \ne \{1\}$ cannot contain 1, because 1 is a prefix of any other element of X*.

What we have just found is that if A is not a code and $A \ne \{1\}$, then A is not a
prefix set; thus we have


Proposition 11.4.1. Every prefix set $\ne \{1\}$ is a code. ∎


By a prefix code we shall understand a prefix set $\ne \{1\}$. To give an example,
$\{y, xy, x^2y, x^3y\}$ is a prefix set and hence a prefix code. We can think of this example
as the alphabet $1, x, x^2, x^3$ with y as place marker.
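The prefix condition is easy to test mechanically, and a message in a prefix code can be split into code words by reading from the left. A minimal sketch, with hypothetical names `is_prefix_set`, `decode` and `CODE`, and with the empty string standing for 1:

```python
def is_prefix_set(A):
    """True if A is non-empty and no element is a prefix of another element."""
    A = set(A)
    return bool(A) and not any(
        a != b and b.startswith(a) for a in A for b in A
    )


def decode(message, code):
    """Split `message` into words of a prefix code, reading letter by letter."""
    words, current = [], ""
    for letter in message:
        current += letter
        if current in code:  # a code word is complete; no look-ahead is needed
            words.append(current)
            current = ""
    if current:
        raise ValueError("message does not end at a code word boundary")
    return words


# The example from the text: y acts as a place marker.
CODE = {"y", "xy", "xxy", "xxxy"}
```

Thus `decode("xyyxxxyxxy", CODE)` gives xy, y, xxxy, xxy. The decoder relies on the prefix property: once a code word has been read, no longer code word can begin with it.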

By symmetry we define a suffix set as a non-empty subset of X* in which no 
element is a suffix, i.e. right-hand factor, of another. Now the left-right symmetry 
of the notion of code shows that every suffix set $\ne \{1\}$ is a code; such a code will be
called a suffix code. Since there exist suffix codes which are not prefix, e.g. {x, xy}, 
we see that the converse of Proposition 11.4.1 is false. In fact there exist procedures 
for determining whether a given subset of X* is a code, but they are quite lengthy 
and will not be given here (the Sardinas—Patterson algorithm, see Lallement (1979), 
Berstel and Perrin (1985)). In any case, prefix codes are of particular interest in 
coding theory, since any message in a prefix code can be deciphered reading letter by
letter from left to right (it is a 'zero-delay' code). This property actually characterizes
prefix codes, for if a code is not prefix, say a = bz, where a, b are both code
words, then at any occurrence of b in a message we have to read past this point to
find out whether a or b is intended.
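For finite sets the test mentioned above can be sketched quite briefly. The version below follows the usual formulation of the Sardinas-Patterson procedure (as found e.g. in Berstel and Perrin), not any formulation from this text: starting from the 'dangling suffixes' w ≠ 1 with aw = b for code words a, b, one repeatedly forms left quotients with the set C, and C is a code precisely if the empty word never appears. The names `left_quotient` and `is_code` are hypothetical.

```python
def left_quotient(U, V):
    """All words w such that uw lies in V for some u in U."""
    return {v[len(u):] for u in U for v in V if v.startswith(u)}


def is_code(C):
    """Sardinas-Patterson test for a finite set C of words."""
    C = set(C)
    if "" in C:
        return False
    # Dangling suffixes: w != 1 with aw = b for code words a, b.
    current = left_quotient(C, C) - {""}
    seen = set()
    # All sets formed consist of suffixes of code words, so the loop terminates.
    while current and frozenset(current) not in seen:
        if "" in current:
            return False  # two different factorizations have closed up
        seen.add(frozenset(current))
        current = left_quotient(C, current) | left_quotient(current, C)
    return True
```

For example, `is_code({"x", "xy", "yx"})` is false, in agreement with xyx = xy.x = x.yx, while the suffix code {x, xy} passes the test although it is not a prefix set.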

On any monoid M we can define a preordering by left divisibility: 


$$u \le v \iff v = uz \ \text{ for some } z \in M. \tag{11.4.1}$$




Clearly this relation is reflexive and transitive; we claim that when M is conical, with
cancellation, then '$\le$' is antisymmetric, so that we have a partial ordering. For if
$u \le v$, $v \le u$, then v = uz, u = vz', hence v = uz = vz'z, so z'z = 1 by cancellation,
and since M is conical, we conclude that z = z' = 1.

We shall be particularly interested in the ordering (11.4.1) on free monoids. In that 
case the set of left factors of any element u is totally ordered, by rigidity, and since the
length of chains of factors is bounded by |u|, the ordering satisfies the minimum
condition. In terms of the ordering (11.4.1) on a free monoid, a prefix set is just 
an anti-chain, and by BA, Proposition 3.2.8 there is a natural bijection between 
anti-chains and lower segments. Here a ‘lower segment’ is a subset containing 
with any element all its left factors; such a set, if non-empty, is called a Schreier set; 
it is clear that every Schreier set contains 1. 

Let us describe this correspondence between prefix sets and Schreier sets more 
explicitly: if C is a prefix set in X*, then the corresponding Schreier set is the complement
of CX* in X*; for a Schreier set P the corresponding prefix set is the set of all
minimal elements in the complement of P. 

A prefix set C is said to be right large if CX* meets wX* for every w € X*. By the 
rigidity of X* this just amounts to saying that every element of X* is comparable 
with some element of C. Hence the Schreier set corresponding to a right large 
prefix set C consists precisely of the proper prefixes of elements of C. To sum up 
these relations we need one more definition. In any monoid a product AB of subsets 
A, B is said to be unambiguous if each element of AB can be written in just one way as
c = ab, where $a \in A$, $b \in B$.


Proposition 11.4.2. Let X* be the free monoid on a finite alphabet X. Then there is a
natural bijection between prefix sets and Schreier sets: to each prefix set C corresponds
$P = X^* \setminus CX^*$ and to each Schreier set P corresponds $C = PX \setminus P$, i.e. the set of minimal
elements in $X^* \setminus P$, and we have the unambiguous product

$$X^* = C^*P. \tag{11.4.2}$$

Moreover, P is finite if and only if C is finite and right large.


Proof. The description of each of C, P in terms of the other follows by BA, Proposition
3.2.8. Thus C is the set of minimal elements in $X^* \setminus P$; since $1 \in P$, any such
minimal element must have the form px ($p \in P$, $x \in X$), and moreover, $px \notin P$. Thus

$$C = PX \setminus P = \{px \mid p \in P,\ x \in X,\ px \notin P\}. \tag{11.4.3}$$


Now to establish (11.4.2), take $w \in X^*$; either $w \in P$ or w has a maximal proper
prefix p in P. In the latter case w = pxu, where $x \in X$ and $px \notin P$; therefore
$px \in C$ by (11.4.3). Further |u| < |w| and $1 \in P$, so by induction on the length,
$u \in C^*P$, hence $w \in C^*P$ and (11.4.2) follows. Now if

$$w = u_1 \dots u_r p = v_1 \dots v_s q, \quad \text{where } p, q \in P,\ u_i, v_j \in C,$$

then since C is prefix, $u_1 = v_1$, so we can cancel $u_1$ and conclude by induction on r
that r = s, $u_i = v_i$ (i = 1, ..., r), p = q. This shows (11.4.2) to be unambiguous.




Suppose that P is finite; then so is C, by (11.4.3). Given $w \in X^*$, either $w \in P$; then
some right multiple of w is not in P, because the length of the elements in P is
bounded. A least such element c is a right multiple of w in C, so $w \le c$. Or $w \notin P$;
then since $1 \in P$, there is a minimal prefix c of w not in P and this is again in C,
so $c \le w$. This shows C to be right large.

Conversely, suppose that C is finite right large and let P be the corresponding
Schreier set. Given $w \in X^*$, either $w \ge c$ or $w < c$ for some $c \in C$, and the first
alternative is excluded for members of P. Thus P consists of all proper prefixes of
elements of C and this is again a finite set. ∎
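The correspondence of Proposition 11.4.2 can be tried out on a finite Schreier set. In the sketch below words are Python strings (the empty string standing for 1) and the function names are hypothetical; `decompose` produces the factorization of (11.4.2) by reading w from the left and cutting whenever an element of C has been completed.

```python
def is_schreier_set(P):
    """Non-empty, and containing with any element all its left factors."""
    P = set(P)
    return bool(P) and all(p[:i] in P for p in P for i in range(len(p)))


def prefix_set_of(P, X):
    """C = PX minus P for a finite Schreier set P, as in (11.4.3)."""
    return {p + x for p in P for x in X} - set(P)


def decompose(w, P, X):
    """Write w = c_1 ... c_k p with c_i in C and p in P, as in (11.4.2)."""
    C = prefix_set_of(P, X)
    factors, current = [], ""
    for letter in w:
        current += letter       # current was in P, so it is now in P or in C
        if current in C:
            factors.append(current)
            current = ""
    return factors, current     # the remainder always lies in P
```

With X = {x, y} and P = {1, x, x^2} one finds C = {y, xy, x^2y, x^3}, and `decompose("xxyxyx", P, "xy")` returns the factors xxy, xy with remainder x in P.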


For a closer study of codes it is useful to have a numerical measure for the
elements of X*. By a measure on X* we understand a homomorphism $\mu$ of X*
into the multiplicative monoid of positive real numbers, such that

$$\sum_{x \in X} \mu(x) = 1. \tag{11.4.4}$$


Clearly $\mu(1) = 1$ and the value of $\mu$ on X can be assigned arbitrarily as positive real
numbers, subject only to (11.4.4); once this is done, $\mu$ is completely determined on
X* by the homomorphism property. For example, writing $m(x) = r^{-1}$ for each $x \in X$, we obtain
the uniform measure on X*:

$$m(w) = r^{-|w|}.$$


Any measure $\mu$ on X* can be extended to subsets by putting

$$\mu(A) = \sum_{a \in A} \mu(a).$$

We note that $\mu(A)$ is a positive real number or $\infty$, and $\mu(X) = 1$ by (11.4.4). For a
product of subsets we have

$$\mu(AB) \le \mu(A)\mu(B), \tag{11.4.5}$$


with equality if the product is unambiguous. To prove (11.4.5), let us first take A, B
finite, say $A = \{a_1, \dots, a_m\}$, $B = \{b_1, \dots, b_n\}$. We have

$$\sum_{i,j} \mu(a_i b_j) = \sum_{i,j} \mu(a_i)\mu(b_j) = \Bigl(\sum_i \mu(a_i)\Bigr)\Bigl(\sum_j \mu(b_j)\Bigr),$$

and here each member of AB occurs just once if AB is unambiguous, and otherwise
more than once, so we obtain (11.4.5) in this case, with equality in the unambiguous
case. In general (11.4.5) holds for any finite subsets A', B' of A, B by what has been
shown. Therefore $\mu(A'B') \le \mu(A)\mu(B)$, and now (11.4.5) follows by taking the
limit.

In particular, for any code C, the product CC is unambiguous by definition, hence
on writing $C^2 = CC$ etc., we have

$$\mu(C^n) = \mu(C)^n, \quad n = 1, 2, \dots. \tag{11.4.6}$$


Let us apply (11.4.6) to X; we have $\mu(X) = 1$ by (11.4.4) and X is clearly a code,
hence we find

$$\mu(X^n) = 1 \quad \text{for all } n \ge 1. \tag{11.4.7}$$


We shall need an estimate of $\mu(A)$ for finite sets A. We recall that $X^+ = X^* \setminus \{1\}$.


Lemma 11.4.3. Let $\mu$ be any measure on X*. Then for any finite subset A of $X^+$,

$$\mu(A) \le \max\{|a| : a \in A\}. \tag{11.4.8}$$


Proof. Since A is finite, the right-hand side of (11.4.8) is finite, say it equals d. Then
$A \subseteq X \cup X^2 \cup \dots \cup X^d$, and so by (11.4.7),

$$\mu(A) \le \mu(X) + \mu(X^2) + \dots + \mu(X^d) = d.$$ ∎


From this lemma we can obtain a remarkable inequality satisfied by codes, which 
shows that to be a code, a set must not be too large. 


McMillan Inequality. Let C be any code on X. Then for any measure $\mu$ on X* we have

$$\mu(C) \le 1. \tag{11.4.9}$$


Proof. Consider first the case where C is finite and let $\max\{|c| : c \in C\} = d$. Then by
(11.4.6), (11.4.8),

$$\mu(C)^n = \mu(C^n) \le nd,$$

since the elements of $C^n$ have length at most nd. Taking n-th roots, we find that
$\mu(C) \le (nd)^{1/n}$. Here d is fixed; letting $n \to \infty$, we have $(nd)^{1/n} \to 1$, therefore
$\mu(C) \le 1$. In the general case every finite subset of C is a code and so satisfies
(11.4.9), hence this also holds for C itself. ∎
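The relations (11.4.5), (11.4.6) and (11.4.9) can be checked numerically for the uniform measure. The sketch below uses exact rational arithmetic; the function names and the sample sets are chosen for illustration only.

```python
from fractions import Fraction
from itertools import product


def uniform_measure(A, r):
    """m(A): the sum of r^(-|a|) over the distinct words a of A."""
    return sum(Fraction(1, r ** len(a)) for a in set(A))


def power(A, n):
    """The set A^n of all products of n elements of A."""
    return {"".join(t) for t in product(A, repeat=n)}


code = {"x", "yx", "yy"}        # a prefix set, hence a code
not_code = {"x", "xy", "yx"}    # xyx = xy.x = x.yx
too_large = {"x", "y", "xy"}    # measure 5/4, so not a code by (11.4.9)
```

On a two-letter alphabet m(code) = 1 and the measure of every power of `code` is again 1, as (11.4.6) requires. For `not_code` we also have measure 1, but its square has measure 7/8, because the word xyx arises from two different products and is counted only once.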


Of course the condition (11.4.9) is by no means sufficient for a code, since any set 
C will satisfy (11.4.9) for a suitable measure, if we choose X large enough. 

A code on X is said to be maximal if it is not a proper subset of a code on X. 
Maximal codes always exist by Zorn’s lemma, since the property of being a code 
is of finite character. The above inequality provides a convenient test for maximality: 


Proposition 11.4.4. Let C be a code on X. If $\mu(C) = 1$ for some measure $\mu$, then C is a
maximal code.


Proof. Suppose that C is not a maximal code. Then we can find a code B containing
C and another element b, say. We have $\mu(B) \ge \mu(C) + \mu(b) > 1$, and this
contradicts (11.4.9). ∎


For example, X and more generally $X^n$ for any n is a maximal code.




Although the inequality (11.4.9) is not sufficient to guarantee that C is a code, 
there is a sense in which this inequality leads to a code. This is expressed in the 
next result, which gives a construction for codes with a prescribed uniform measure. 


Theorem 11.4.5. Let $n_1, n_2, \dots$ be any sequence of positive integers. Then there exists a
code $A = \{a_1, a_2, \dots\}$ with $|a_i| = n_i$ in an alphabet of r letters if and only if

$$r^{-n_1} + r^{-n_2} + \dots \le 1 \quad \text{(Kraft-McMillan inequality)}. \tag{11.4.10}$$

We remark that the left of (11.4.10) is just the uniform measure of A.


Proof. The necessity of (11.4.10) is clear by (11.4.9) and the above remark.
Conversely, assume that (11.4.10) holds, take the $n_i$ to be ordered by size:
$n_1 \le n_2 \le \dots$, and let X = {0, 1, ..., r − 1} be the alphabet. Define the partial
sums of (11.4.10):

$$s_k = r^{-n_1} + r^{-n_2} + \dots + r^{-n_{k-1}},$$

and for each $s_k$ define an integer

$$p_k = r^{n_k}s_k = r^{n_k - n_1} + r^{n_k - n_2} + \dots + r^{n_k - n_{k-1}}.$$

Each $p_k$ is an integer and since $s_k < 1$ by (11.4.10), we have

$$0 \le p_k < r^{n_k}.$$

Now take $a_k$ to be the element of X* formed by expressing $p_k$ in the scale of r, with
enough 0's prefixed to bring the length up to $n_k$:

$$a_k = \alpha_1\alpha_2 \dots \alpha_{n_k}, \quad \text{where } p_k = \alpha_1 r^{n_k - 1} + \alpha_2 r^{n_k - 2} + \dots + \alpha_{n_k - 1}r + \alpha_{n_k}, \quad \alpha_i \in X.$$

We claim that $A = \{a_1, a_2, \dots\}$ is a code of the required type. The lengths are right
by construction, and we shall complete the proof by showing that A is a prefix code.
If $a_j$ is a prefix of $a_i$, j < i, then $a_j$ is obtained from $a_i$ by cutting off the last $n_i - n_j$
digits. Thus $p_j$ is the greatest integer in the fraction

$$\frac{p_i}{r^{n_i - n_j}} = r^{n_j}s_i \ge r^{n_j}(s_j + r^{-n_j}) = p_j + 1.$$

But this is a contradiction. Thus A is indeed a prefix code. ∎


For example, take r = 3 and consider the sequence 1, 1, 2, 2, 3, 3, 3. We have
$\mu(A) = 1/3 + 1/3 + 1/9 + 1/9 + 1/27 + 1/27 + 1/27 = 1$, so we have a maximal
code. It is given by the table: 


The construction of $a_k$ given in Theorem 11.4.5 can be described by the rule: choose
the least number in the ternary scale which is not a prefix of 222 and which has no
$a_i$ (i < k) as a prefix.
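The construction in the proof of Theorem 11.4.5 is easily carried out for a finite sequence of lengths. The following sketch assumes a non-empty finite list of lengths and at most ten digits, so that each digit is a single character; the function name `kraft_code` is hypothetical. It works with the integers $p_k = r^{n_k - n_1} + \dots + r^{n_k - n_{k-1}}$ directly, avoiding fractions.

```python
def kraft_code(lengths, r):
    """Prefix code with the given word lengths over the digits 0, ..., r-1 (r <= 10)."""
    lengths = sorted(lengths)
    longest = lengths[-1]
    # Integer form of the Kraft-McMillan inequality (11.4.10).
    if sum(r ** (longest - n) for n in lengths) > r ** longest:
        raise ValueError("the lengths violate the Kraft-McMillan inequality")
    words = []
    for k, n in enumerate(lengths):
        # p_k = r^{n_k} s_k, where s_k is the partial sum over the earlier lengths.
        p = sum(r ** (n - m) for m in lengths[:k])
        digits = []
        for _ in range(n):  # p_k in the scale of r, padded with 0's to length n_k
            p, d = divmod(p, r)
            digits.append(str(d))
        words.append("".join(reversed(digits)))
    return words
```

For r = 3 and the lengths 1, 1, 2, 2, 3, 3, 3 this produces the words 0, 1, 20, 21, 220, 221, 222.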




We have seen that codes are certain subsets of X* that are not too large; we now 
introduce a class of subsets that are not ‘too small’, in order to study the interplay 
between these classes. A subset A of X* is said to be complete if the submonoid 
generated by it meets every ideal of X*, i.e. every word in X* occurs as a factor in 
some word in $\langle A \rangle$:


$$X^*wX^* \cap \langle A \rangle \ne \emptyset \quad \text{for all } w \in X^*.$$


Proposition 11.4.6. Let A be a finite complete subset of X* and let m be the uniform
measure on X*. Then $m(A) \ge 1$.


Proof. Let L be the set of prefixes, R the set of suffixes and F the set of factors of
members of A. Since A is finite, L, R and F are all finite. We claim that

$$R\langle A \rangle L \cup F = X^*. \tag{11.4.11}$$

For, given $w \in X^*$, we have, by the completeness of A,

$$pwq = a_1a_2 \dots a_n,$$

where $a_i \in A$. Now either w occurs as a factor in some $a_i$ and so $w \in F$, or two or
more of the $a_i$ form part of w. In that case w consists of a word in $\langle A \rangle$ with a suffix
of some $a_i$ on the left and a prefix of some $a_j$ on the right, and this is just
(11.4.11). Now (11.4.11) shows that $m(X^*) \le m(R)\,m(\langle A \rangle)\,m(L) + m(F)$, and since
m(X*) is infinite, while m(F), m(R), m(L) are finite, it follows that $m(\langle A \rangle)$ is infinite.
Thus

$$\infty = m(\langle A \rangle) \le \sum_n m(A^n) \le \sum_n m(A)^n.$$

If m(A) < 1, this is a contradiction, therefore $m(A) \ge 1$, as claimed. ∎

We next establish a connexion with codes:


Theorem 11.4.7 (Schützenberger). Any maximal code is complete.


Proof. Let A be a code which is not complete; we shall show how to enlarge it. If
|X| = 1, any non-empty set is complete, so we have $A = \emptyset$ and then X is a larger
code. When |X| > 1, we have to find $b \notin A$ such that $A \cup \{b\}$ is a code. Since A is
not complete, there is a word $c \in X^*$ such that $X^*cX^* \cap \langle A \rangle = \emptyset$. One might be
tempted at this point to adjoin c to A, but this leads to problems because c might
intersect itself; we shall construct b to avoid this. Let $|c| = \gamma$ and put c = xc',
where $x \in X$. Choose $y \ne x$ in X and put

$$b = cxy^{\gamma}. \tag{11.4.12}$$

From the definition of c it is clear that

$$X^*bX^* \cap \langle A \rangle = \emptyset. \tag{11.4.13}$$




We claim that $A \cup \{b\}$ is a code. For if not, then we have an equation

$$a_1 \dots a_r = a'_1 \dots a'_s, \quad \text{where } a_i, a'_j \in A \cup \{b\}, \tag{11.4.14}$$

and (11.4.14) is non-trivial. Since A is a code, b must occur in (11.4.14), and by
(11.4.13) it must occur on both sides of (11.4.14), say $a_i = a'_j = b$ for some i, j.
We take i, j minimal; if the two occurrences of b do not overlap, we have a contradiction
to (11.4.13) (see diagram)


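[Diagram: the two factorizations, showing the relative positions of the two occurrences of b and a segment lying in $\langle A \rangle$.]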


If there is an overlap, it must be at least $\gamma$ letters, by (11.4.12), but that is impossible,
because in b letters 1 and $\gamma + 1$ are x, whereas the last $\gamma$ letters of b are y. ∎


We note that the converse result does not hold (see Exercise 5). Our final result 
clarifies the relation between complete sets and codes. 


Theorem 11.4.8 (Boë, de Luca and Restivo [1980]). Let A be a finite subset of X*.
Then any two of the following imply the third:


(a) A is a code,
(b) A is complete,
(c) m(A) = 1.


Proof. (a, b) ⇒ (c). If A is a complete code, then m(A) = 1 by Proposition 11.4.6
and McMillan's inequality. (b, c) ⇒ (a). If m(A) = 1, but A is not a code, then
for some n, $m(A^n) < m(A)^n = 1$, hence $A^n$ is not complete, and so neither is A.
(c, a) ⇒ (b). If m(A) = 1 and A is a code, then A is a maximal code and so it is
complete, by Theorem 11.4.7. ∎


The situation is reminiscent of what happens for bases in a vector space. A basis is 
a linearly independent spanning set, and of the following conditions on a set A of 
vectors in a vector space V any two imply the third:


(a) A is linearly independent,
(b) A is a spanning set,
(c) |A| = dim V.


In the proof the exchange axiom plays a vital role: it has the consequence that any 
spanning set contains a linearly independent set which still spans. The analogue 
here is false; there are minimal complete sets that are not codes. For example, take 
X = {x.y}, A = (x? xc yx. xy, yx. y}. It is easily verified that A is minimal complete. 
The subset A) = {x°. x"yx, yx. y} is not a code and m(A,) = 1, while all other proper 
subsets of A are codes. What can be shown is the following (see Boë, de Luca and
Restivo [1980]):
A minimal complete set is a code iff all its proper subsets are codes.




For finite codes the converse of Theorem 11.4.7 holds: every finite complete code 
is maximal (this follows from Theorem 11.4.8 and Proposition 11.4.4), but there are 
infinite complete codes which are not maximal (see Exercise 5); this also shows that 
Theorem 11.4.8 does not extend to infinite sets. 


Exercises 


1. Determine which of the following are codes: (i) {xy. xy’. 7}, (ii) {x xy. x-y, 
xy? yn}, Gil) Ge xy ry sy? pxd. 

2. Construct a code for r = 2 and the sequence 1, 2,3..... Likewise for r = 4 and 
the sequence T, [i l..2..2,2.3.3; 3y124 

3. Let X be an alphabet, G a group and $f : X^* \to G$ a homomorphism. Show that for
any subgroup H of G, $Hf^{-1}$ is a free submonoid of X*, and its generating set is a
maximal code which is bifix (i.e. prefix and suffix). Such a code is called a group
code.

4. Let X be an alphabet and $w_x(u)$ the x-length of $u \in X^*$. Show that for any integer
m the mapping $f : u \mapsto w_x(u) \pmod m$ is a homomorphism from X* to Z/m and
describe the group code $0f^{-1}$. Show that for m = 2, X = {x, y},
$0f^{-1} = \{y\} \cup \{xy^*x\}$; find $0g^{-1}$ where $g : u \mapsto w_x(u) \pmod 3$.

5. Let X = {x, y}. Show that $\delta(u) = w_x(u) - w_y(u)$ is a homomorphism to Z.
Describe the corresponding group code D (this is known as the Dyck code). 
Show that D is complete and remains complete if one element is omitted. 

6. Let $f : X^* \to Y^*$ be an injective homomorphism of free monoids. Show that if A
is a code in X*, then Af is a code in Y*; if B is a code in Y*, then $Bf^{-1}$ is a code
in X*.

7. Let A, B be any codes in X*. Show that $A^n$ is a code for any $n \ge 1$, but AB need
not be a code. 

8. Let X = {x, y}, $\mu(x) = p$, $\mu(y) = q$, where p, q > 0, p + q = 1. Show that
$C = \{xy, yx, xy^2x\}$ is not a code, even though it satisfies (11.4.9).


11.5 Free algebras and formal power series rings 


There is a further way of describing languages, namely as formal power series. This 
is in some respects the simplest and most natural method. Let X be a finite alphabet 
and k a commutative field. By a formal power series in X over k we understand a
function f on X* with values in k. The value of f at $u \in X^*$ is denoted by (f, u) and f itself
may be written as a series

$$f = \sum_{u \in X^*} (f, u)u. \tag{11.5.1}$$

Here (f, u) is called the coefficient of u; in particular, (f, 1) is called the constant term
of f.




Series are added and multiplied by the rules 


$$(f + g, u) = (f, u) + (g, u), \tag{11.5.2}$$

$$(fg, u) = \sum_{yz = u} (f, y)(g, z). \tag{11.5.3}$$

Since each element of X* has only a finite number of factors, the sum in (11.5.3) is
finite, so fg is well-defined. The set of all these power series is denoted by $k\langle\langle X \rangle\rangle$; it is
easily seen to form a k-algebra with respect to these operations. For each power series
f its support is defined as

$$D(f) = \{u \in X^* \mid (f, u) \ne 0\}.$$


Thus u lies in the support of f precisely if it occurs in the expression (11.5.1) for f. 
The elements of finite support are called polynomials in X; they are in fact just poly- 
nomials, i.e. k-linear combinations of products of elements of X, but care must be 
taken to preserve the order of the factors, since the elements of X do not commute. 
These polynomials form a subalgebra $k\langle X \rangle$, called the free k-algebra on X (see
Section 8.7). We remark that $k\langle X \rangle$ can also be defined as the monoid algebra of
the free monoid X*, in analogy to the group algebra. 

For each power series f we define its order o(f ) as the minimum of the lengths of 
terms in its support. The order is a positive integer or zero, according as (f, 1) is or is
not zero. For a polynomial f we can also define its degree d(f ); it is the maximum of 
the lengths of terms in its support. If all terms of f have the same length r, so that 
o(f)=d(f) =r, then fis said to be homogeneous of degree r. 

We remark that if u is a series of positive order, we can form the series

$$u^* = 1 + u + u^2 + \dots.$$

The infinite series on the right 'converges' because if o(u) = r, then for any given
d, $u^n$ contributes only if $rn \le d$. Thus in calculating the terms of degree d in $u^*$
we need only consider $u^n$ for n = 0, 1, ..., [d/r]. We also note that $u^*$ satisfies the
equations

$$uu^* = u^*u = u^* - 1.$$

Hence $(1 - u)u^* = u^*(1 - u) = 1$, so $u^*$ is the inverse of 1 − u:

$$(1 - u)^{-1} = \sum_{n=0}^{\infty} u^n. \tag{11.5.4}$$

It is easily verified that $k\langle X \rangle$ has the familiar universal property: every mapping
$\varphi : X \to A$ into a k-algebra A can be extended in just one way to a homomorphism
$\bar{\varphi} : k\langle X \rangle \to A$. As a consequence every k-algebra can be written as a homomorphic
image of a free k-algebra, possibly on an infinite alphabet. Of course free algebras 
on an infinite alphabet are defined in exactly the same way; for power series rings 
there are several possible definitions, depending on the degrees assigned to the 
variables, but this need not concern us here, as we shall only consider the case of 
a finite alphabet. 




The free algebra $k\langle X \rangle$ may be regarded as a generalization of the polynomial ring
k[x], to which it reduces when X consists of a single element x. The polynomial ring
is of course well known and has been thoroughly studied. The main tool is the 
Euclidean algorithm; this allows one to prove that k[x] is a principal ideal domain 
(PID) and a unique factorization domain (UFD) (see BA, Section 10.2). The UF- 
property extends to polynomials in several (commuting) variables (BA, Section 
10.3), but there is no analogue to the principal ideal property in this case. For the 
non-commutative polynomial ring $k\langle X \rangle$ the UF-property persists, albeit in a more
complicated form, and we shall have no more to say about it here (see Exercise 3 
below and Cohn (1985), Chapter 3). The principal ideal property generalizes as 
follows. We recall from Section 8.7 that a free right ideal ring, right fir for short, 
is a ring R with invariant basis number (IBN) in which every right ideal is free as 
right R-module; left firs are defined similarly and a left and right fir is called a fir. 
In the commutative case a fir is just a PID and the fact that k[x] is a PID generalizes 
to the assertion that $k\langle X \rangle$ is a fir. This is usually proved by the weak algorithm, a
generalization of the Euclidean algorithm, to which it reduces in the commutative 
case. We shall not enter into the details (see Cohn (1985), Chapter 2), but confine 
ourselves below to giving a direct proof that $k\langle X \rangle$ is a fir. This method, similar to
the technique used to prove that subgroups of free groups are free, is due to Jacques 
Lewin; our exposition follows essentially Berstel and Reutenauer (1988) (see also 
Cohn (1985) Chapter 6). 


Theorem 11.5.1. Let $F = k\langle X \rangle$ be the free algebra on a finite set X over a field k and let
$\mathfrak{a}$ be any right ideal of F. Then there exists a Schreier set P in X* which is maximal
linearly independent (mod $\mathfrak{a}$). If C is the corresponding prefix set, determined as in
Proposition 11.4.2, then for each $c \in C$ there is an element of $\mathfrak{a}$:

$$f_c = c - \sum \alpha_{c,p}\,p \quad (p \in P,\ \alpha_{c,p} \in k),$$

where the sum ranges over all $p \in P$, but has only finitely many non-zero terms for each
$c \in C$, such that $\mathfrak{a}$ is free as right F-module on the $f_c$ ($c \in C$) as basis. Similarly for left
ideals, and F has IBN, so it is a fir.


Proof. The monoid X* is a k-basis of F, hence its image in $F/\mathfrak{a}$ is a spanning set and
it therefore includes a basis of $F/\mathfrak{a}$ as k-space. Moreover, we can choose this basis to
be a Schreier set, by building it up according to length. Thus if $P_n$ is a Schreier set
which forms a basis for all elements of F of degree at most n (mod $\mathfrak{a}$), then the
set $P_n \cup P_nX$ spans the space of elements of degree at most n + 1 and by choosing a
basis from it we obtain a Schreier set $P_{n+1}$ containing $P_n$ and forming a basis
(mod $\mathfrak{a}$) for the elements of degree at most n + 1. In this way we obtain a Schreier
set $P = \bigcup P_n$ which is maximal k-linearly independent (mod $\mathfrak{a}$) and hence a k-basis.
Let C be the corresponding prefix set. For each $c \in C$ the set $P \cup \{c\}$ is still a Schreier
set, but by the maximality of P it is linearly dependent mod $\mathfrak{a}$, say

$$f_c = c - \sum \alpha_{c,p}\,p \in \mathfrak{a}, \tag{11.5.5}$$




where the sum ranges over P and almost all the $\alpha_{c,p}$ vanish. We claim that every
$b \in F$ can be written as

$$b = \sum f_c g_c + \sum \beta_p p, \quad \text{where } g_c \in F,\ \beta_p \in k, \tag{11.5.6}$$

and the sums range over C and P respectively. By linearity it is enough to prove this
when b is a monomial. When $b \in P$, this is clear; we need only take $\beta_p = 1$ for p = b
and the other coefficients zero. When $b \notin P$, it has a prefix in C by Proposition
11.4.2, say b = cu, where $c \in C$ and $u \in X^*$. By (11.5.5) we have

$$b = cu = f_c u + \sum \alpha_{c,p}\,pu. \tag{11.5.7}$$

For any $p \in P$, either $pu \in P$ or $pu = c_1u_1$, where $c_1 \in C$ and hence $|p| < |c_1|$,
$|u_1| < |u|$. In the first case we have achieved the form (11.5.6); in the second case
we use induction on |u| to express $c_1u_1$ in the same form. Thus we can reduce all
the terms on the right of (11.5.7) to the form (11.5.6) and the conclusion follows.

We claim that the elements (11.5.5) form the desired basis of $\mathfrak{a}$. To show that
they generate $\mathfrak{a}$, let us take $b \in \mathfrak{a}$ and apply the natural homomorphism $F \to F/\mathfrak{a}$.
Writing the image of r as $\bar{r}$, we have from (11.5.6)

$$\bar{b} = \sum \beta_p \bar{p}.$$

Since $\bar{b} = 0$ and the $\bar{p}$ are linearly independent by construction, we have $\beta_p = 0$, so
$b = \sum f_c g_c$ and it follows that the $f_c$ generate $\mathfrak{a}$. To prove their independence over F, assume
that $\sum f_c g_c = 0$, where not all the $g_c$ vanish. Then by (11.5.5)

$$\sum c g_c = \sum \alpha_{c,p}\,pg_c. \tag{11.5.8}$$


Take a word w of maximal length occurring in some g,, say in g, . Since C is a prefix 


at 


code, c w occurs with a non-zero coefficient A on the left of (11.5.8). Hence 


A= Q ply p. 


where j2, p is the coefficient of cw in pg-. Now the relation c’w = pu can hold only 
when p is a proper prefix of c’, hence |p| < |c'|, |u| > |w] and this contradicts the 
definition of w. This contradiction shows that the f. are linearly independent over 
F, so they form a basis of a, which is therefore a free right ideal. By symmetry 
every left ideal is free, and F clearly has IBN, since we have a homomorphism 


F — k, obtained by setting X = 0. This shows F to be a fir. at 

We recall the equation $X^* = C^*P$ obtained in Proposition 11.4.2. In the power
series ring this may be written as $(1 - X)^{-1} = (1 - C)^{-1}P$, where $X$, $C$, $P$ are
now the sums of the corresponding sets. On multiplying up, we obtain $1 - C = P(1 - X)$, or

$$C - 1 = P(X - 1). \tag{11.5.9}$$

This tells us again that the prefix set $C$ consists of all products $px$ ($p \in P$, $x \in X$)
which are not in $P$. By Proposition 11.4.2, $P$ is finite iff $C$ is finite and right large,
but in our case this just means that $\mathfrak{a}$ is finitely generated and large as right ideal,
while the finiteness of $P$ means that $\mathfrak{a}$ has finite codimension in $F$. By replacing
the elements of $X$ by 1 in (11.5.9), we thus obtain


Corollary 11.5.2. Let $F = k\langle X\rangle$ be the free algebra as in Theorem 11.5.1 and $\mathfrak{a}$ a right
ideal of $F$. Then $\mathfrak{a}$ has finite codimension in $F$ if and only if $\mathfrak{a}$ is finitely generated and
large as right ideal. If $|X| = d$, $\mathfrak{a}$ has codimension $r$ and has a basis of $n$ elements, then

$$n - 1 = r(d - 1). \tag{11.5.10}$$
∎

We remark that (11.5.10) is analogous to Schreier's formula for the rank of a subgroup
of a free group (see Section 3.4); it is known as the Schreier–Lewin formula.
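
The count behind (11.5.10) can be checked on a small instance. The following is only a sketch under stated assumptions: words are represented as Python strings over single-character letters, the Schreier set `P` is a made-up example, and `prefix_set` is a hypothetical helper that forms $C$ as the set of products $px$ not lying in $P$, as described after (11.5.9).

```python
def prefix_set(P, X):
    """Return C = PX minus P for a prefix-closed (Schreier) set P of words over X."""
    P = set(P)
    # A Schreier set contains every prefix of each of its words (including the empty word).
    assert all(p[:i] in P for p in P for i in range(len(p) + 1)), "P is not prefix-closed"
    return {p + x for p in P for x in X} - P

X = ["x", "y"]                 # alphabet, d = 2
P = {"", "x", "y", "xy"}       # hypothetical Schreier set, r = 4
C = prefix_set(P, X)

d, r, n = len(X), len(P), len(C)
print(sorted(C))               # the words of the prefix set
print(n - 1 == r * (d - 1))    # the Schreier-Lewin count n - 1 = r(d - 1)
```

The reason the count works in general is visible in the code: the $rd$ products $px$ are all distinct, and exactly the $r - 1$ non-empty words of $P$ are removed from them.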

We now turn to consider the power series ring. To describe the structure of $k\langle\langle X\rangle\rangle$
we recall that a local ring is a ring $R$ in which the set of all non-units forms an ideal $\mathfrak{m}$.
Clearly $\mathfrak{m}$ is then the unique maximal ideal of $R$, and $R/\mathfrak{m}$ is a skew field, called the
residue class field of $R$.


Proposition 11.5.3. The power series ring $k\langle\langle X\rangle\rangle$ on any finite set $X$ over a field $k$ is a
local ring with residue class field $k$. Its maximal ideal consists of all elements with zero
constant term.


Proof. The mapping $X \to 0$ defines a homomorphism of $k\langle\langle X\rangle\rangle$ onto $k$, hence the
kernel $\mathfrak{m}$, consisting of all elements with zero constant term, is an ideal in $k\langle\langle X\rangle\rangle$.
It follows that $k\langle\langle X\rangle\rangle/\mathfrak{m} \cong k$, and $\mathfrak{m}$ contains no invertible element. Any element $f$
not in $\mathfrak{m}$ has non-zero constant term $\lambda$ and so $\lambda^{-1}f = 1 - u$, where $o(u) > 0$. By
(11.5.4) we have $(1 - u)^{-1} = u^*$, hence $f^{-1} = \lambda^{-1}u^*$. ∎
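
The inversion used in this proof can be tried out on truncated series. The sketch below is an illustration only: series are stored as dictionaries from words (Python strings) to coefficients, everything is cut off at a degree `N` chosen here for the example, and the helper names `mul`, `add` and `inverse` are hypothetical.

```python
from fractions import Fraction

N = 4  # truncation degree (an assumption of this sketch)

def mul(f, g):
    """Product of two series, keeping only words of length at most N."""
    h = {}
    for u, a in f.items():
        for v, b in g.items():
            if len(u) + len(v) <= N:
                h[u + v] = h.get(u + v, 0) + a * b
    return {w: c for w, c in h.items() if c != 0}

def add(f, g, sign=1):
    """Sum f + sign*g, dropping zero coefficients."""
    h = dict(f)
    for w, c in g.items():
        h[w] = h.get(w, 0) + sign * c
    return {w: c for w, c in h.items() if c != 0}

def inverse(f):
    """Inverse of f with constant term lam != 0, as lam^{-1} (1 + u + u^2 + ...)."""
    lam = f.get("", 0)
    if lam == 0:
        raise ValueError("a series with zero constant term is not invertible")
    one = {"": Fraction(1)}
    scaled = {w: c / lam for w, c in f.items()}   # lam^{-1} f = 1 - u
    u = add(one, scaled, sign=-1)                 # u has positive order
    star, power = dict(one), dict(one)
    for _ in range(N):                            # u^k has no terms of degree <= N once k > N
        power = mul(power, u)
        star = add(star, power)
    return {w: c / lam for w, c in star.items()}

f = {"": Fraction(2), "x": Fraction(1), "yx": Fraction(-3)}
g = inverse(f)
print(mul(f, g) == {"": 1}, mul(g, f) == {"": 1})
```

Exact rational coefficients are used so that the cancellation in $(1 - u)(1 + u + \dots + u^N)$ is exact and only the constant term survives below degree $N + 1$.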


A power series $f$ is called rational if it can be obtained from the elements of $k\langle X\rangle$
by a finite number of operations of addition, multiplication and inversion of series
with non-zero constant terms. The rational series form a subring of $k\langle\langle X\rangle\rangle$, denoted
by $k\langle X\rangle_{\mathrm{rat}}$, as we shall see in Proposition 11.5.4 below, and the method of
Proposition 11.5.3 shows that $k\langle X\rangle_{\mathrm{rat}}$ is again a local ring.

Let us note that any square matrix $A$ over $k\langle X\rangle_{\mathrm{rat}}$ is invertible provided that its
constant term is invertible over $k$. To prove this fact, we write $A = A_0 - B$, where
$A_0$ is over $k$ and $B$ has zero constant term. By hypothesis $A_0$ is invertible over $k$
and on writing $A_0^{-1}A = I - A_0^{-1}B$ we reduce the problem to the case where
$A_0 = I$. As in the scalar case we can now write

$$A^{-1} = (I - A_0^{-1}B)^{-1}A_0^{-1} = \Bigl[\sum_{n \ge 0} (A_0^{-1}B)^n\Bigr]A_0^{-1},$$

which makes sense because all the terms of $A_0^{-1}B$ have positive order. Strictly speaking
this does not make it clear that all the entries lie in $k\langle X\rangle_{\mathrm{rat}}$; to see this we need to
invert the entries of $I - B'$, where the terms of $B'$ have positive orders, term by term.
Since the diagonal entries are units, while the non-diagonal entries are non-units,
this is always possible. This method also yields a criterion for the rationality of
power series.


Proposition 11.5.4 (Schützenberger). The set $k\langle X\rangle_{\mathrm{rat}}$ of all rational series is a
subalgebra of $k\langle\langle X\rangle\rangle$. Moreover, for any $f \in k\langle\langle X\rangle\rangle$ the following conditions are equivalent:


(a) $f$ is rational,
(b) $f = u_1$ is the first component of the solution of a system

$$u = Bu + b, \tag{11.5.11}$$

where $B$ is a matrix and $b$ is a column over $k\langle X\rangle$, $B$ is homogeneous of degree 1 and
$b$ has degree at most 1,
(c) $f = u_1$ is a component of a system

$$Fu = b, \tag{11.5.12}$$

where $F$ is a matrix with invertible constant term and $b$ is a column over $k\langle X\rangle$.


We remark that (11.5.11) can also be written as

$$(I - B)u = b; \tag{11.5.13}$$

thus it has the form (11.5.12), where $F$ now has constant term $I$ and has no terms of
degree higher than 1.
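
Because $B$ has positive order, a system of the form (11.5.11) can be solved degree by degree: starting from $u = 0$ and repeatedly replacing $u$ by $Bu + b$ fixes one more degree at each pass. The sketch below illustrates this on a made-up $2 \times 2$ system; the dictionary representation of series, the truncation degree `N` and the helper names are assumptions of the example, not part of the text.

```python
N = 4  # truncation degree (an assumption of this sketch)

def mul(f, g):
    """Truncated product of series stored as {word: coefficient}."""
    h = {}
    for u, a in f.items():
        for v, c in g.items():
            if len(u) + len(v) <= N:
                h[u + v] = h.get(u + v, 0) + a * c
    return h

def add(f, g):
    h = dict(f)
    for w, c in g.items():
        h[w] = h.get(w, 0) + c
    return h

def step(B, b, u):
    """One pass u <- B u + b for a matrix B and columns b, u of series."""
    new = []
    for i, row in enumerate(B):
        total = dict(b[i])
        for Bij, uj in zip(row, u):
            total = add(total, mul(Bij, uj))
        new.append(total)
    return new

x, y = {"x": 1}, {"y": 1}
B = [[y, x],
     [x, y]]              # homogeneous of degree 1
b = [{"": 1}, {}]         # degree at most 1

u = [{}, {}]
for _ in range(N + 1):    # pass k determines all terms of degree < k
    u = step(B, b, u)

print(u == step(B, b, u))                        # fixed point modulo degree > N
print(sorted(u[0], key=lambda w: (len(w), w)))   # support of the first component
```

In this particular instance the first component collects, with coefficient 1, the words containing an even number of occurrences of $x$, since $u = \sum_k B^k b$ and the $(1,1)$-entry of $B^k$ sums exactly those words of length $k$.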


Proof. (a) ⇒ (b). We shall show that the set of all elements satisfying (b) forms a
subalgebra of $k\langle\langle X\rangle\rangle$ containing $k\langle X\rangle$, in which every series of order 0 is
invertible. It then follows that this subalgebra contains $k\langle X\rangle_{\mathrm{rat}}$.

It is clear that $a \in k \cup X$ is the solution of $u_1 = a$. Given $f$, $g$, suppose that $f = u_1$,
where $u$ is the solution of (11.5.11) and $g = v_1$, where $v$ is the solution of $v = Cv + c$,
where $C$ satisfies the same conditions as $B$ in (11.5.11). We shall rewrite these
equations as $(I - B)u = b$, $(I - C)v = c$. Then $f - g$ is the first component of the
solution of the system

$$\begin{pmatrix} I - B & e_1 - B_1 \;\; 0 \\ 0 & I - C \end{pmatrix} w = \begin{pmatrix} b \\ c \end{pmatrix}, \tag{11.5.14}$$

where $e_1 = (1, 0, \dots, 0)^T$ and $B_1$ is the first column of $B$ (so the upper right block has
first column $e_1 - B_1$ and zeros elsewhere); for (11.5.14) is satisfied by
$w = (u_1 - v_1, u_2, \dots, u_m, v_1, \dots, v_n)^T$. In (11.5.14) the matrix on the left is not of
the required form, but it can be brought to the form of a matrix with constant
term $I$ (and no terms of degree higher than 1) by subtracting row $m + 1$ from
row 1 to get rid of the coefficient 1 in the $(1, m + 1)$-position.

Similarly $fg$ is the first component of the solution of

$$\begin{pmatrix} I - B & -b \;\; 0 \\ 0 & I - C \end{pmatrix} w = \begin{pmatrix} 0 \\ c \end{pmatrix}, \tag{11.5.15}$$

for (11.5.15) is satisfied by $w = (u_1v_1, u_2v_1, \dots, u_mv_1, v_1, \dots, v_n)^T$. If $b$ has a
non-zero constant term, we can bring (11.5.15) to the required form by subtracting
appropriate multiples of row $m + 1$ from row 1, ..., row $m$.


It remains to invert a series of order 0 or, what comes to the same thing (after what
has been shown), a series with constant term 1. Let $f$ have zero constant term and
suppose that $f = u_1$, where $u$ satisfies (11.5.11). We shall invert $1 + f$ by finding
an equation for $g$, where $(1 - g)(1 + f) = 1$. We may assume that $b = (0, \dots, 0, 1)^T$
by writing (11.5.11) in the form

$$\begin{pmatrix} I - B & -b \\ 0 & 1 \end{pmatrix} \begin{pmatrix} u \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix},$$

subtracting multiples of columns $2, \dots, n - 1$ from column $n$ to reduce the constant
term of $b$ to 0, and adding corresponding multiples of 1 to the components of $u$. This
leaves $u_1$ unchanged and it is sufficient: $b_1$ already has zero constant term because
this is true of $u_1$. Thus our system now has the form (after a change of notation)

$$(I - B)u = e_n,$$

and here $n > 1$, because $u_1$ has zero constant term. Let $E_{n1}$ be the matrix with $(n, 1)$-entry
1 and 0 elsewhere. Then we have

$$(I - B + E_{n1})u = e_n(1 + u_1). \tag{11.5.16}$$

But we can also solve the system

$$(I - B + E_{n1})v = e_n, \tag{11.5.17}$$

and we can again bring the matrix on the left to the required form by subtracting the
first row from the last, without affecting the solution. Comparing (11.5.16) and
(11.5.17), we find that

$$u = v(1 + u_1).$$

In particular, $u_1 = v_1(1 + u_1)$, hence $(1 - v_1)(1 + u_1) = 1 + u_1 - u_1 = 1$, so we
have found a left inverse for $1 + u_1$. By uniqueness it is a two-sided inverse, which
is what we had to find. This then shows that the set of components of solutions
of systems (11.5.11) is a ring containing $k\langle X\rangle$ and admitting inversion when possible.

(b) ⇒ (c) is clear, and to prove (c) ⇒ (a) we must show that the solution of
(11.5.12) has rational components. We have seen that the matrix $I - B$ has an inverse
whose entries are rational, hence the same is true of $u = (I - B)^{-1}b$.

We have now shown (a)-(c) to be equivalent, and the elements satisfying (11.5.11)
form a subring containing $k\langle X\rangle$, hence a subalgebra. It follows that $k\langle X\rangle_{\mathrm{rat}}$ is a
subalgebra in which all series of order 0 have inverses. ∎


With every language $L$ on the alphabet $X$ we associate a power series $f_L$, its
characteristic series, defined by

$$(f_L, u) = \begin{cases} 1 & \text{if } u \in L, \\ 0 & \text{if } u \notin L. \end{cases}$$

In this way the language is described by a single element of $k\langle\langle X\rangle\rangle$. Our main object
will be to characterize the regular and context-free languages.


Theorem 11.5.5 (Schützenberger). A language on $X$ is regular if and only if its
characteristic series is rational.


Proof. Let $L$ be a regular language. Then $L$ is generated by a grammar with rules of
the form

$$\alpha \to x\beta, \qquad \alpha \to y; \tag{11.5.18}$$

moreover, each word of $L$ has a single derivation. Let us number the variables of the
grammar as $u_1, \dots, u_n$, where $u_1 = \sigma$, and number the terminal letters as $x_1, \dots, x_r$.
The rules (11.5.18) may be written as

$$\alpha = \sum x\beta + \sum y, \tag{11.5.19}$$

where the summations are over all the right-hand sides of rules with $\alpha$ on the left.
If we express (11.5.19) in terms of the $u$'s and $x$'s, we find

$$u_i = \sum_j b_{ij}u_j + b_i; \tag{11.5.20}$$

to generate the language we replace $\sigma = u_1$ by the right-hand side of (11.5.20) and
continue replacing each $u_i$ by the corresponding right-hand side. We thus obtain
power series $f_1, \dots, f_n$ such that $u_i = f_i$ satisfies (11.5.20), and $f_1$ is the characteristic
series of $L$; clearly $f_1$ is rational.

Conversely, if the characteristic series for $L$ is rational, then it is given as a
component of the solution of a system of the form (11.5.20), where the $b_{ij}$ are linear
homogeneous in the $x_k$ and the $b_i$ are of degree at most 1. We take the grammar
of $L$ in the form $u_i \to x_ku_j$ if $x_k$ occurs in $b_{ij}$ and $u_i \to 1$ if $b_i$ has a non-zero constant
term. This ensures that the derivations give precisely the words of $L$, thus $L$ is
regular. ∎


For example, consider the language $\{xy^n\}$, with grammar $\sigma \to \sigma y$, $\sigma \to x$. Its
characteristic series is obtained by solving the equation $u = uy + x$, i.e.
$u = x(1 - y)^{-1} = \sum xy^n$.
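
The equation $u = uy + x$ of this example can be solved by substituting the right-hand side into itself, which is how the grammar generates the language. The following sketch does this on series truncated at a degree `N`; the dictionary representation and the helpers `mul` and `add` are assumptions made for the illustration.

```python
N = 5  # truncation degree (an assumption of this sketch)

def mul(f, g):
    """Truncated product of series stored as {word: coefficient}."""
    h = {}
    for u, a in f.items():
        for v, b in g.items():
            if len(u) + len(v) <= N:
                h[u + v] = h.get(u + v, 0) + a * b
    return h

def add(f, g):
    h = dict(f)
    for w, c in g.items():
        h[w] = h.get(w, 0) + c
    return h

x, y = {"x": 1}, {"y": 1}

u = {}                        # start from the zero series
for _ in range(N + 1):        # each pass fixes one more degree
    u = add(mul(u, y), x)     # u <- u*y + x

expected = {"x" + "y" * n: 1 for n in range(N)}   # the words x y^n of length <= N
print(u == expected)
```

Every coefficient that appears is 0 or 1, so the truncated solution is the characteristic series of $\{xy^n\}$ up to degree $N$.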

Next we consider the problem of characterizing context-free languages. For this
purpose we define another subalgebra of the power series ring. An element $f$ of
$k\langle\langle X\rangle\rangle$ is said to be algebraic if it is of the form $f = \alpha + u_1$, where $\alpha \in k$ and $u_1$ is
the first component of the solution of a system of equations

$$u_i = \varphi_i(u_1, \dots, u_n, x_1, \dots, x_r) \quad (i = 1, \dots, n), \tag{11.5.21}$$

where $\varphi_i$ is a (non-commutative) polynomial in the $u$'s and $x$'s without constant
term or linear term in the $u$'s. The set of all algebraic elements is denoted by
$k\langle X\rangle_{\mathrm{alg}}$; we shall show that it is a subalgebra of $k\langle\langle X\rangle\rangle$.


Proposition 11.5.6. Any system (11.5.21), where $\varphi$ is a polynomial without constant
term or linear term in the $u$'s, has a unique solution in $k\langle\langle X\rangle\rangle$ with components of
positive order, and the set $k\langle X\rangle_{\mathrm{alg}}$ of all such elements is a subalgebra and a local
ring, each element of order zero being invertible.


Proof. Writing $u_i^{(\nu)}$ for the component of degree $\nu$ of $u_i$, we find by equating
homogeneous components in (11.5.21),

$$u_i^{(\nu)} = \varphi_i^{(\nu)}(u, x).$$

Here $\varphi_i^{(\nu)}$ is the sum of all terms of degree $\nu$ in $\varphi_i$. By hypothesis, for any term $u_j^{(\mu)}$
occurring in $\varphi_i^{(\nu)}$ we have $\mu < \nu$, so the components $u_i^{(\nu)}$ are uniquely determined
in terms of the $u_j^{(\mu)}$ with $\mu < \nu$, while $u_i^{(0)} = 0$, again by hypothesis. Thus (11.5.21)
has a unique solution $u_i$ of positive order in the $x$'s.

If $u_i = \varphi_i(u, x)$ $(i = 1, \dots, m)$, $v_j = \psi_j(v, x)$ $(j = 1, \dots, n)$ are two such systems,
then to show that $u_1 - v_1$, $u_1v_1$, $\sum u_1^n = (1 - u_1)^{-1}$ are algebraic we combine the
above systems of equations for $u_i$, $v_j$ with the equations $w = \varphi_1 - \psi_1$, $w = \varphi_1\psi_1$,
$w = \varphi_1w + \varphi_1$ respectively. This shows that we have a subalgebra. Moreover, the
elements of order 0 are invertible, so we have a local ring. ∎


It is clear that we have the inclusions

$$k\langle X\rangle \subseteq k\langle X\rangle_{\mathrm{rat}} \subseteq k\langle X\rangle_{\mathrm{alg}} \subseteq k\langle\langle X\rangle\rangle; \tag{11.5.22}$$

that the inclusions are strict is easily seen, by considering the case where $X$ consists of
a single letter.

We now come to the promised characterization of context-free languages:


Theorem 11.5.7 (Schützenberger). A language $L$ in an alphabet $X$ is context-free if
and only if its characteristic series is algebraic.


Proof. Let $L$ be context-free, generated by a grammar with the rules $u_i \to w_{ik}$, where
$w_{ik}$ is a word in the $u$'s and $x$'s, and $u_1 = \sigma$ is the sentence symbol. If there is a rule of
the form $u_i \to u_j$, we replace it by the rules $u_i \to f$, where $f$ runs over the right-hand
sides of all the rules $u_j \to w_{jk}$. We now write

$$u_i = \varphi_i(u, x), \tag{11.5.23}$$

where $\varphi_i(u, x)$ is the sum of the right-hand sides of all the rules with $u_i$ on the left.
On solving the system (11.5.23) we obtain for $u_1$ the series $f_L$, hence $f_L$ is algebraic.

Conversely, assume that $f_L$ is algebraic, given by $u_1$, where $u$ is the solution of
(11.5.23). Then the language $L$ is obtained by applying all the rules $u_i \to w$, where
$w$ runs over all the words in the support of $\varphi_i(u, x)$; hence $L$ is context-free, as we
had to show. ∎

To give an example, the language $\{x^ny^n\}$ has the characteristic series $\sum x^ny^n$,
which is obtained by solving the equation

$$u = 1 + xuy.$$
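
As in the proof of Proposition 11.5.6, the solution of $u = 1 + xuy$ is determined degree by degree, and substituting the right-hand side into itself produces it. The sketch below carries this out on truncated series; the dictionary representation, the truncation degree `N` and the helper names are assumptions of the example.

```python
N = 6  # truncation degree (an assumption of this sketch)

def mul(f, g):
    """Truncated product of series stored as {word: coefficient}."""
    h = {}
    for u, a in f.items():
        for v, b in g.items():
            if len(u) + len(v) <= N:
                h[u + v] = h.get(u + v, 0) + a * b
    return h

def add(f, g):
    h = dict(f)
    for w, c in g.items():
        h[w] = h.get(w, 0) + c
    return h

x, y, one = {"x": 1}, {"y": 1}, {"": 1}

u = {}
for _ in range(N // 2 + 1):              # each pass adds the next word x^n y^n
    u = add(one, mul(mul(x, u), y))      # u <- 1 + x u y

expected = {"x" * n + "y" * n: 1 for n in range(N // 2 + 1)}
print(u == expected)
```

The unknown sits between $x$ and $y$ here, so the equation is not of the linear form (11.5.11); this is what distinguishes the algebraic case from the rational one.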


Exercises 


1. Adapt the proof of Theorem 11.5.1 to the case of an infinite alphabet.

2. Verify that the prefix set associated with a right ideal of finite codimension in a
free algebra is a maximal code. To what extent does the finiteness condition of
Corollary 11.5.2 apply to general (non-free) $k$-algebras?

3. Factorize the element $xyzyx + xyz + zyx + xyx + x + z$ of $F = k\langle x, y, z\rangle$ in all
possible ways. (It can be shown that any two complete factorizations of an
element of $F$ have the same number of terms, and these terms can be paired off in
such a way that corresponding terms are 'similar', where $a$, $b$ are similar iff
$F/aF \cong F/bF$.)

4. Show that in $\mathbf{R}\langle x, y\rangle$ the element $xy^2x + xy + yx + x^2 + 1$ is an atom, but it does
not remain one under extension of $\mathbf{R}$ to $\mathbf{C}$.

5. Show that every CF-language on one letter is regular.

6. Show that the inclusions in (11.5.22) are strict.

7. Define the Hankel matrix of a power series $f$ as the infinite matrix $H(f)$ indexed
by $X^*$, whose $(u, v)$-entry is $(f, uv)$. Show that $f$ is rational iff $H(f)$ has finite
rank.


Further exercises on Chapter 11 


1. Let $A$ be an infinite set and $M(A)$ the set of all injective mappings of $A$ into itself
such that the complement of the image is infinite. Show that $M(A)$ is a semigroup
admitting right cancellation and right division (writing mappings on
the right), i.e. given $\alpha, \beta \in M(A)$, the equation $\alpha x = \beta$ has a unique solution.

2. Show that two elements of a free monoid commute iff they can be written as
powers of the same element.

3. Let $F$ be a free monoid. Show that if $uv = vw$, then there exist $a, b \in F$ such that
$u = ab$, $w = ba$, $v = (ab)^ra = a(ba)^r$ for some $r \ge 0$.

4. Let $C$ be the monoid on $a$, $b$ as generating set with defining relation $ba = 1$.
Show that each element of $C$ can be written uniquely as $a^rb^s$, $r, s \ge 0$. ($C$ is
called the bicyclic monoid. Hint. Define $C$ as set of mappings of $\mathbf{N}^2$ into itself
by the rules $(m, n)a = (m, n - 1)$ if $n \ge 1$, $(m, 0)a = (m + 1, 0)$, $(m, n)b =
(m, n + 1)$.)

5. Show that any CF-language can be generated by a grammar whose rules are all of
the form $\alpha \to \beta\gamma$ or $\alpha \to x$ ($\alpha, \beta, \gamma \in V$, $x \in X$). (Hint. Use induction on the
lengths of the right-hand sides of all rules not of this form; this is called the
Chomsky normal form.)

6. A grammar is called self-embedding if it includes a derivation $\alpha \to u\alpha v$, where
$u, v \ne 1$ and $\alpha \in V$. Show that a language $L$ is regular iff there is a CF-grammar
generating $L$ which is not self-embedding.

7. Show that every language of type 0, 2 or 3 is closed under substitution: if $L$ has
type $\nu$ (= 0, 2, 3) with alphabet $X = \{x_1, \dots, x_r\}$ and $x_i$ is replaced by a
language of type $\nu$ with an alphabet $Y$ disjoint from $X$, the resulting language is
of type $\nu$ on $X \cup Y$. A CS-language is closed under substitution provided that
the language substituted is proper.

8. Show that a CF-language satisfies the following strengthening of the pumping
lemma (sometimes called the iteration lemma): If $L$ is context-free, there exists
an integer $p$ such that for any word $w$ of length $|w| > p$ and any partition
$(\nu_1, \dots, \nu_5)$ of $|w|$ such that $\nu_3 > 0$ and either $\nu_2 > 0$ or $\nu_4 > 0$, there is a
factorization $w = v_1 \cdots v_5$ with $|v_i| = \nu_i$ and $v_1v_2^nv_3v_4^nv_5 \in L$ for all $n$. Use the
result to verify that $L = \{a^*bc\} \cup \{a^pba^nca^n \mid p \text{ prime}, n \ge 0\}$ is not context-free
but satisfies the conditions of the pumping lemma (see Berstel (1979)).



9. Show that the generating set of a maximal free submonoid of a free monoid is a
maximal code.

10. Show that a subset $C$ of $X^+$ is a code iff the submonoid generated by $C$ has the
characteristic series $(1 - C)^{-1}$.

11. Verify that the power series ring on $X$ may be interpreted as the incidence
algebra of $X^*$ when the latter is ordered by right divisibility (see BA, Section 5.6).
(Hint. Define $f_{u,v} = (f, z)$ if $u = zv$ and 0 otherwise.) Hence deduce Proposition
11.5.3 from the fact that the matrix in the incidence algebra is invertible iff all its
diagonal elements are invertible (see BA, Proposition 5.6.1).
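
The mappings suggested in the hint to Exercise 4 can be tried out directly. The sketch below reads the rules as $(m, n)a = (m, n - 1)$ for $n \ge 1$, $(m, 0)a = (m + 1, 0)$ and $(m, n)b = (m, n + 1)$, with mappings written on the right; the function names and the finite test grid are assumptions of the example, and the checks are numerical evidence on that grid, not a proof.

```python
def a(p):
    m, n = p
    return (m, n - 1) if n >= 1 else (m + 1, 0)

def b(p):
    m, n = p
    return (m, n + 1)

def act(word, p):
    """Apply a word in the letters 'a', 'b' to p, reading left to right."""
    for letter in word:
        p = a(p) if letter == "a" else b(p)
    return p

grid = [(m, n) for m in range(4) for n in range(4)]

print(all(act("ba", p) == p for p in grid))   # ba acts as the identity
print(any(act("ab", p) != p for p in grid))   # ab does not
# a^r b^s sends (0, 0) to (r, s), so different pairs (r, s) give different mappings
print(all(act("a" * r + "b" * s, (0, 0)) == (r, s) for r in range(4) for s in range(4)))
```

The last check is the point of the hint: the image of $(0, 0)$ already separates the words $a^rb^s$, which gives the uniqueness of the normal form.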


Bibliography 


This is primarily a list of books where the topics are pursued further (and which were 
often used as sources). A second list contains articles having a bearing on the text as 
well as those referred to in the text. References in the text are by name and date, the 
latter enclosed in round brackets for books and square brackets for papers. 


I. Books 


Albert, A.A. (1939), Structure of Algebras, AMS Colloquium Publications 24, 
Providence, RI. 

Arbib, M.A. (1969), Theories of Abstract Automata, Prentice-Hall, Englewood Cliffs, 
NJ. 

Artin, E. (1957) Geometric Algebra, Interscience, New York. 

Artin, E. (1965), Collected Papers, Addison-Wesley, Reading, MA. 

Barbilian, D. (1956), Teoria Aritmetica a Idealilor (in inele necomutative), Ed. Acad. 
Rep. Pop. Romine, Bucharest. 

Barwise, J. (Ed.) (1977), Handbook of Mathematical Logic, North-Holland, 
Amsterdam. 

Bass, H. (1962), The Morita Theorems, Oregon Lectures. 

Bass, H. (1968), Algebraic K-theory, Benjamin, New York. 

Bell, J.L. and Slomson, A.B. (1971), Models and Ultraproducts: An Introduction, 
North-Holland, Amsterdam. 

Berstel, J. (1979), Transductions and Context-free Languages, Teubner, Stuttgart. 

Berstel, J. and Perrin, D. (1985), Theory of Codes, Academic Press, New York. 

Berstel, J. and Reutenauer, C. (1988), Rational Series and their Languages, Springer, 
Heidelberg. 

Bourbaki, N. (1974), Algebra I, Chapters 1-3, Addison-Wesley, Reading, MA. 

Bourbaki, N. (1990), Algebra II, Chapters 4-7, Springer, Heidelberg. 

Burnside, W. (1911, 1955), Theory of Groups of Finite Order, Dover, New York. 

Chevalley, C. (1951), Introduction to the Theory of Algebraic Functions of One 
Variable, AMS Math. Surveys 6, New York. 

Cohn, P.M. (1966), Morita Equivalence and Duality, Queen Mary College Math. 
Notes. 




Cohn, P.M. (1977), Skew Field Constructions, LMS Lecture Note Series 27, 
Cambridge University Press. 

Cohn, P.M. (1981) Universal Algebra, 2nd edn. Reidel, Dordrecht. 

Cohn, P.M. (1985), Free Rings and their Relations, 2nd edn., LMS Monographs 19, 
Academic Press, London. 

Cohn, P.M. (1991), Algebraic Numbers and Algebraic Functions, Chapman & Hall. 

Cohn, P.M. (1994), Elements of Linear Algebra, Chapman & Hall, London. 

Cohn, P.M. (1995), Skew Fields, Theory of General Division Rings, Vol. 57, Encyclo- 
pedia of Mathematics and its Applications, Cambridge University Press. 

Cohn, P.M. (2000), Introduction to Ring Theory, SUMS, Springer, London. 

Conway, J.H. and Sloane, N.J.A. (1988), Sphere Packings, Lattices and Groups, 
Grundlehren d. math. Wiss. 290, Springer, Berlin. 

Curtis, C.W. and Reiner, I. (1981), Methods of Representation Theory I, John Wiley
& Sons, New York.

Davis, M. (1958), Computability and Unsolvability, McGraw-Hill, New York. 

Dickson, L.E. (1901, 1958), Linear Groups, with an Exposition of the Galois Field 
Theory, Dover, New York. 

Draxl, P. (1983), Skew Fields, LMS Lecture Note Series 83, Cambridge University 
Press. 

Eilenberg, S. (1974-78), Automata, Languages and Machines, A-C, Academic Press, 
New York. 

Eisenbud, D. (1995), Commutative Algebra, with a View to Algebraic Geometry, 
Springer-Verlag, New York. 

Faith, C. (1981) Algebra I: Rings, Modules and Categories, Grundlehren d. math. 
Wiss. 190, Springer, Heidelberg. 

Feit, W. (1982), The Representation Theory of Finite Groups, North-Holland, 
Amsterdam. 

Goodearl, K.R. and Warfield Jr., R.B. (1989), An Introduction to Non-commutative 
Noetherian Rings, LMS Student Texts 16, Cambridge University Press. 

Gorenstein, D. (1982), Finite Simple Groups, Plenum, New York. 

Hall Jr., M. (1959), The Theory of Groups, Macmillan, New York. 

Hartshorne, R. (1977), Algebraic Geometry, Graduate Texts in Math. 52, Springer, 
Heidelberg. 

Herman, G.T. and Rozenberg, G. (1975), Developmental Systems and Languages, 
North-Holland, Amsterdam. 

Herstein, I.N. (1968), Noncommutative Ring Theory, Carus Math. Monographs 15, 
Math. Association of America. 

Herstein, I.N. (1976), Rings with Involution, Chicago Lectures in Math., Chicago 
University Press. 

Hill, R. (1985), Introduction to Coding Theory, Oxford University Press. 

Hilton, P.J. and Stammbach, U. (1971), A Course in Homological Algebra, Graduate 
Texts in Math. 4, Springer, Heidelberg. 

Huppert, B. (1967), Endliche Gruppen I, Grundlehren d. math. Wiss. 134, Springer, 
Berlin. 

Jacobson, N. (1953), Lectures in Abstract Algebra II, Linear Algebra, Van Nostrand,
New York.




Jacobson, N. (1956, 1964), Structure of Rings, AMS Colloquium Publs. 37, 
Providence, RI. 

Jacobson, N. (1985, 1989), Basic Algebra I, II (2nd edn.) Freeman, San Francisco. 

Jacobson, N. (1996), Finite-dimensional Division Algebras over Fields, Springer, 
Berlin. 

James, G. and Kerber, A. (1981), The Representation Theory of the Symmetric 
Group, Vol. 16, Encyclopedia of Mathematics and its Applications, Addison- 
Wesley, Reading, MA. 

Jategaonkar, A.V. (1986), Localization in Noetherian Rings, LMS Lecture Note Series 
98, Cambridge University Press. 

Lallement, G. (1979), Semigroups and Combinatorial Applications, John Wiley & 
Sons, New York. 

Lam, T.Y. (1978), Serre’s Conjecture, Lecture Notes in Math. 635, Springer, Berlin.

Lint, J.H. van (1982), Introduction to Coding Theory, Graduate Texts in Math. 86, 
Springer, Berlin. 

Lothaire, M. (1983, 1997), Combinatorics on Words, Cambridge Math. Library, 
Cambridge University Press. 

Mac Lane, S. (1963), Homology, Grundlehren d. math. Wiss. 114, Springer, Berlin. 

McConnell, J.C. and Robson, J.C. (1987), Non-commutative Noetherian Rings, John 
Wiley & Sons, Chichester. 

McEliece, R.J. (1977), The Theory of Information and Coding, Encyclopedia of 
Mathematics and its Applications, Addison-Wesley, Reading, MA. 

Mitchell, B. (1965), Theory of Categories, Academic Press, New York. 

Peano, G. (1889), Arithmetices Principia, Nova Methodo Exposita, Torino.

Pierce, R.S. (1982), Associative Algebras, Graduate Texts in Math. 88, Springer, 
Heidelberg. 

Procesi, C. (1973), Rings with a Polynomial Identity, Dekker, New York. 

Reiner, I. (1976), Maximal Orders, LMS Monographs 5, Academic Press, London. 

Robinson, A. (1963), Introduction to Model Theory and the Metamathematics of 
Algebra, North-Holland, Amsterdam. 

Rowen, L.H. (1980), Polynomial Identities in Ring Theory, Academic Press, New 
York. 

Rowen, L.H. (1988), Ring Theory I, II, Academic Press, New York.

Schofield, A.H. (1985), Representations of Rings over Skew Fields, LMS Lecture 
Notes 92, Cambridge University Press. 

Serre, J.-P. (1967, 1971), Representations Linéaires des Groupes Finis, Hermann, 
Paris. 

Stroyan, K.D. and Luxemburg, W.A.J. (1976), Introduction to the Theory of Infini- 
tesimals, Academic Press, New York. 

Weber, H. (1894, 1896, 1906), Lehrbuch der Algebra I-III (1960 reprint), Chelsea, 
New York. 

Weil, A. (1967), Basic Number Theory, Grundlehren d. math. Wiss. 144, Springer, 
Berlin. 

Weyl, H. (1939), The Classical Groups, Princeton University Press (2nd edn. 1946). 


II. Papers


Amitsur, S.A. [1965], Generalized polynomial identities and pivotal polynomials, 
Trans. Amer. Math. Soc. 114, 210-226. 

Amitsur, S.A. [1966], Rational identities and applications to algebra and geometry, 
J. Algebra 3, 304-359. 

Amitsur, S.A. [1972], On central division algebras, Isr. J. Math. 12, 408-420. 

Auslander, M. and Goldman, O. [1960], The Brauer group of a commutative ring, 
Trans. Amer. Math. Soc. 97, 367-409. 

Azumaya, G. [1950], Corrections and supplementaries to my paper concerning 
Krull-Schmidt’s theorem, Nagoya Math. J. 1, 117-124. 

Bass, H. [1960], Finitistic dimension and a generalization of semiprimary rings, 
Trans. Amer. Math. Soc. 95, 466-488. 

Bergman, G.M. [1974a], Modules over coproducts of rings, Trans. Amer. Math. Soc. 
200, 1-32. 

Bergman, G.M. [1974b], Coproducts and some universal ring constructions, Trans.
Amer. Math. Soc. 200, 33-88.

Bergman, G.M. [1978], The diamond lemma in ring theory, Adv. in Math. 29, 178- 
218. 

Boé, J.M., de Luca, A. and Restivo, A. [1980], Minimal completable sets of words,
Theor. Comput. Sci. 12, 325-332.

Chase, S.U. [1960], Direct products of modules, Trans. Amer. Math. Soc. 97, 457-473.

Chevalley, C. [1955], Sur certains groupes simples, Tohoku Math. J. 7, 14-66. 

Cohn, P.M. [1956], Embeddings in semigroups with one-sided division, J. London 
Math. Soc. 31, 169-181. 

Cohn, P.M. [1961], Quadratic extensions of skew fields, Proc. London Math. Soc. (3)
11, 531-556.

Cohn, P.M. [1966], On the structure of the $GL_2$ of a ring, Publ. Math. IHES 30, 5-59.

Cohn, P.M. [1987], Valuations in free fields, in ‘Algebra, some current trends’, Proc. 
Varna 1986, eds. L.L. Avramov and K.B.Tchakerian, Springer Lecture Notes in 
Math. 1352, 75-87. 

Cohn, P.M. [1989], The construction of valuations on skew fields, J. Indian Math. 
Soc. 54, 1-45. 

Dieudonné, J. [1943], Les déterminants sur un corps non-commutatif, Bull. Soc.
Math. France 71, 27-45.

Gelfand, I.M. and Retakh, V. [1997], Quasideterminants I, Sel. math. New ser. 3, 
517-546. 

Goldie, A.W. [1958], The structure of prime rings under ascending chain conditions, 
Proc. London Math. Soc. (3) 8, 589-608. 

Hall, P. [1928], A note on soluble groups, J. London Math. Soc. 3, 89-105. 

Hall, P. [1937], A characteristic property of soluble groups, J. London Math. Soc. 12, 
198-200. 

Henkin, L. [1960], On mathematical induction, Amer. Math. Monthly 67, 323-338. 




Landweber, P.S. [1963], Three theorems on phrase-structure grammars of type 1, 
Inform. Control 6, 131-136. 

Merkuryev, A.S. and Suslin, A.A. [1986], On the structure of Brauer groups of fields, 
Math. USSR Izvestiya 27, 141-155. 

Nagata, M. [1957], A remark on the unique factorization theorem, J. Math. Soc.
Japan 9, 143-145.

Newman, M.H.A. [1942], On theories with a combinatorial definition of ‘equivalence’,
Ann. of Math. 43, 223-243.

Ore, O. [1931], Linear equations in non-commutative fields, Ann. Math. 32, 463- 
477. 

Roganov, Yu.V. [1975], The dimension of a tensor product on a projective bimodule 
(Russian), Mat.Zametki 18, 895-902. 

Sasiada, E. and Cohn, P.M. [1967], An example of a simple radical ring, J. Algebra 5, 
373-377. 

Schofield, A.H. [1985], Artin’s problem for skew field extensions, Math. Proc. Camb.
Phil. Soc. 97, 1-6.

Shannon, C.E. [1948], A mathematical theory of communication, Bell Syst. Tech. J. 
27, 379-423, 623-656. Reprinted in Slepian (ed.) Key Papers in the Development 
of Information Theory, 1974, IEEE Press, New York. 

Sierpinski, W. [1945], Sur les fonctions de plusieurs variables, Fund. Math. 33,
169-173.

Smoktunowicz, A. [2002], A simple nil ring exists, Comm. Alg. 30(1), 27-59. 

Webber, D.B. [1970], Ideals and modules in simple Noetherian rings, J. Algebra 16, 
239-242. 


List of Notations 


The number indicates the page where the term is first used or defined. Terms used 
only or mainly in one location are not included. When no page number is given, the 
term is defined in BA. 


Number systems 

N the natural numbers, 24 
$N_0$ the natural numbers with 0
$Z/m$ the numbers mod m

Z the integers 2

Q the rational numbers

R the real numbers 

C the complex numbers 


Set theory 


$\emptyset$ the empty set

$|X|$ the cardinal of the set X

$\mathcal{P}(X)$ the power set (set of all subsets) of X, 6
$X \setminus Y$ the complement of Y in X, xi

$Y^X$ or Map(X, Y) the set of all mappings from X to Y, 2
Map(X) set of all mappings of X into itself, 4
ker f kernel of a correspondence f, 5

$\aleph_0$ aleph-null, the cardinal of N

Number theory 

max(a, b) the larger of a and b 

min(a, b) the smaller of a and b

$a \mid b$ a divides b

(a, b) highest common factor (HCF) of a and b
[a, b] least common multiple (LCM) of a and b
[x] greatest integer not exceeding x

$\delta_{ij}$ Kronecker delta

$\mu(n)$ Möbius function

$\varphi(n)$ Euler function


Group theory 


$\mathrm{Sym}_n$ or $S_n$ symmetric group of degree n

$\mathrm{Alt}_n$ or $A_n$ alternating group of degree n

sgn $\sigma$ sign of the permutation $\sigma$

$\mathrm{St}_p$ stabilizer of the point p, 119

$C_n$ cyclic group of order n

$D_m$ dihedral group of order 2m, 93

(G:H) index of H in G

$(x, y) = x^{-1}y^{-1}xy$ commutator of x and y

$G'$ commutator subgroup (derived group) of G
Aut(G) automorphism group of G

Inn(G) group of inner automorphisms of G

$N \lhd G$ N is a normal subgroup of G

$A \rtimes G$ semidirect product, 93

$N_G(H)$ normalizer of H in G

$GL_n(R)$ general linear group over a ring R, 116
$E_n(R)$ subgroup generated by elementary matrices
$U_n(C)$ unitary group over C, 242

$Sp_{2n}(K)$ symplectic group over K, 121

$O_n(K)$, $SO_n(K)$ orthogonal group over K, 126
$K_1(A)$, $SK_1(A)$ Whitehead group of A, 196


Rings and modules 


mys space of all m x m matrices over V 

BO se ONES) space of m-component column vectors over V 
Vite" v") space of m-component row vectors over V 
$M_n(R)$ or $R_n$ $n \times n$ matrix ring over R

$e_{ij}$ matrix unit, 116

$B_{ij}(a)$ elementary matrix, 116

diag$(d_1, \dots, d_n)$ diagonal matrix, 271

Hom(U. V) set of all homomorphisms from U to V 
End(U) ring of all endomorphisms of U 

$U \otimes V$ tensor product of U and V

tM torsion submodule of M 


p(a) rank of an endomorphism a, 310 




P(M) projective cover of M, 141 

T(M) top of M, 141 

Z(M) singular submodule of M, 286 

$\lambda_a$, $\rho_a$ left, right multiplication by a, 179

R° opposite ring of R 

J(R) Jacobson radical of R 

$A_E$ algebra obtained from A by extending the ground field to E
$B_k$ Brauer group of the field k, 188

B(F/k) relative Brauer group, 206 

Deg A degree of A, 189 

(a. b.k), (a,b: k|  quaternion algebra, 201 

(F/k, a. @) cyclic algebra, 215 

$R^{\times}$ set of all non-zero elements in an integral domain
$R^{ab}$ abelianization of R, 80

A! augmented algebra, 323 

U(A) group of units of A, 118 

Ann(X ) annihilator of X 

$R[x]$ polynomial ring in x over R

$R[x; \alpha, \delta]$ skew polynomial ring, 276

$R[[x; \alpha]]$ formal power series ring in x over R, 279
$R((x; \alpha))$ ring of formal Laurent series, 280

$A_1[k]$ Weyl algebra, 278

$k\langle X\rangle$ free k-algebra on a set X, 78

$k\langle\langle X\rangle\rangle$ free power series k-algebra on a set X
T,(U) tensor A-ring over U, 78 

$X^*$ free monoid on a set X, 396

x? earay 


Field theory 


[V:k] dimension of V over k 

$N_{E/F}(x)$ norm of x from E to F

$T_{E/F}(x)$ trace of x from E to F

d(f) degree of polynomial f, 275

$\mathcal{F}(R)$ field of fractions of an integral domain R
$F_q$ field of q elements

$Q_p$ field of p-adic numbers

H Hamilton quaternions, 202 


Category theory 


Ob(-s/) class of all A-objects 
A(X, Y) set of all maps from X to Y 
Ens category of sets 


Gp category of groups
Ab category of abelian groups
Rg category of rings
Mod_R category of right R-modules
_SMod_R category of (S,R)-bimodules
A ∏ B product of A and B, 34
A ∐ B coproduct of A and B, 35
A ⊕ B biproduct of A and B, 35
lim→ (G) direct limit, 46
lim← (G) inverse limit, 46
H_n(G, A) homology group, 96
H^n(G, A) cohomology group, 96
χ(M) Euler characteristic, 72
Ext^n(A, B) Ext functor, 73
Tor_n(A, B) Torsion functor, 75


Author index 


Albert, Abraham Adrian (1905-72) 189, 
204, 218 

Amitsur, Shimshon Avraham (1921-94) 
204, 293, 300, 303f., 306, 320, 337, 357 

Andrunakievich, Vladimir Aleksandrovich 
(1917—) 333 

Artin, Emil (1898-1962) 180, 197, 345, 365 

Artin, Michael (1934-) 336 

Auslander, Maurice (1926-94) 159, 163 

Azumaya, Goro (1920—) 139, 159, 181, 336 


Baer, Reinhold (1902-79) 53, 331

Bass, Hyman (1932-) 142, 144

Bergman, George Mark (1943—) 19, 318 

Bezout, Etienne (1730-83) 270, 342 

Birkhoff, Garrett (1911—96) 10, 17 

Boé, J.M. 418 

Bose, Raj Chandra (1901-87) 388 

Bourbaki, Nicolas (190§f/—) 312 

Brauer, Richard Dagobert (1901-77) 159, 
185, 188, 344 

Bruhat, François (1929-) 349

Burnside, William (1852-1927) 109, 259f., 
328 


Capelli, Alfredo (1858-1916) 302 

Cartan, Henri Paul (1904—) 129, 344 

Cayley, Arthur (1821-95) 346, 397 

Chase, Stephen Urban (1932—) 165 

Chevalley, Claude (1909-84) 116, 199, 309,
361

Chomsky, Noam (1928—) 399f., 428 

Clifford, Alfred Hoblitzelle (1908-92) 
258 

Cohn, Paul Moritz (1924-) 20, 200, 325,
364f.

Conway, John Horton (1937-) 392 


Dedekind, Richard (1831-1916) 205 

Dicks, Warren (1947—) 83 

Dickson, Leonard Eugene (1874-1954) 116, 
130, 219 

Dieudonné, Jean Alexandre (1906-92)
129f., 328, 347, 351 

Dilworth, Robert P. (1914-93) 296 

Dirichlet, Peter Gustav Lejeune (1805-59) 
26 

Draxl, Peter K. (1944-83) 197, 282, 349 

Dyck, Walther von (1856-1934) 419


Eckmann, Beno (1917-) 55 
Eilenberg, Samuel (1913-98) 79, 88, 150 
Euler, Leonhard (1707-83) 72 


Fatou, Pierre Joseph Louis (1878-1929) 
307 

Feit, Walter (1930—) 105 

Fisher, James L. 358 

Fitting, Hans (1906-38) 136 

Formanek, Edward F. (1942-) 301

Freyd, Peter J. (1936-) 53 

Frobenius, Ferdinand Georg (1849-1917) 
202, 229, 231, 237, 247, 256, 260f. 


Galois, Evariste (1811-32) 6 

Gauss, Carl Friedrich (1777-1855) 360 

Gelfand, Izrail Moiseevich (1913—) 353 

Gilbert, Edgar N. (1923-) 375

Gleason, Andrew Mattei (1921—) 393 

Golay, M.J.E. (1902—) 392 

Goldie, Alfred William (1920—) 269, 282, 
288f. 

Goldman, Oscar (1925-86) 159 

Golod, Evgenii Solomonovich (1935-)
332f.




Goppa, V. D. 389 

Green, James Alexander (1926—) 139, 264 
Greibach, Sheila 403 

Grothendieck, Alexander (1928—) 47 


Hall, Philip (1904-82) 11, 102ff., 105f. 

Hamilton, William Rowan (1805-65) 200f.

Hamming, Richard Wesley (1915-98) 373,
375, 380 

Hankel, Hermann (1839-73) 428 

Hasse, Helmut (1898-1980) 194, 204

Hattori, Akira 159 

Henkin, Leon (1921—) 1 

Herstein, Israel Nathan (1923-88) 191, 306, 
333, 346 

Higgins, Philip John (1926—) 308 

Higman, Graham (1917—) 308, 333 

Hilbert, David (1862-1943) 85, 218 

Hirata, Kazuhiko 160 

Hochschild, Gerhard P. (1916—) 169 

Hocquenghem, Alexis 388 

Hölder, Otto (1859-1937) 101

Hopf, Heinz (1894-1971) 113

Hotzel, Eckehart 328 

Hua Loo-Keng (1910-85) 344 


Iwasawa, Kenkichi 118, 126 


Jacobson, Nathan (1910-99) 135, 276, 309, 
324, 328, 342 
Jategaonkar, Arun Vinayak 281, 318 


Kaplansky, Irving (1917—) 303f., 320, 342 
Kasch, Friedrich (1923—) 289 

Kemer, A. R. 307 

Kharchenko, Vladislav Kirillovich 218 
Koshevoi, E. G. 281

Köthe, Gottfried (1905-89) 191, 331f.
Kraft, L.G. 416 

Krull, Wolfgang (1899-1971) 138f. 
Kupferoth, Achim 187


Lambek, Joachim (1922—) 63 

Landweber, Peter S. 410 

Latyshev, Victor Nicolaevich 296, 298 

Laurent, Pierre Alphonse (1813-54) 279

Leech, John (1926-92) 392 

Leibniz, Gottfried Wilhelm von 
(1646-1716) 346 

Levitzki, Jacob (1904-56) 293, 332 




Lewin, Jacques (1940—) 421, 423 
Litoff, O. 316 

Luca, Aldo de 418 

Lukasiewicz, Jan (1878-1956) 12 


Mackey, George W. (1916—) 256f. 

MacWilliams, Florence J. (1917-90) 382 

Magnus, Wilhelm (1907-90) 115 

Malcev, Anatoli Ivanovich (1909-67) 177 

Martindale, Wallace S. III 306

Maschke, Heinrich (1853-1908) 226ff., 
242 

Matlis, Eben 164 

McMillan, Brockway 415f. 

Merkuryev, Alexander S. 217 

Molien, Theodor E. (1861-1941) 241, 393

Morita, Kiiti (1915-95) 148, 156, 159


Nagata, Masayoshi (1927-) 308, 333

Nakayama, Tadasi (1912-64) 141, 181

Neumann, John von (1903-57) 163, 247 

Newman, Maxwell Herman Alexander 
(1897-1984) 19 

Nielsen, Jakob (1890-1959) 133 

Noether, Amalie Emmy (1882-1935) 183f., 
266 


Ore, Oystein (1899-1968) 266ff., 276, 307 


Papp, Zoltan 164 

Patterson, C.W. 412 

Peano, Giuseppe (1858-1932) 24 
Platonov, Victor Pavlovich (1939—) 197 
Plotkin, M. (1922—) 381 

Posner, Edward C. (1933-93) 334f. 
Procesi, Claudio (1941-) 302, 336 


Quillen, Daniel Grey (1940—) 85 


Ramanujan, Srinivasa (1887-1920) 390 
Ray-Chaudhuri, Dijendra K. 388 
Razmyslov, Yuri Pavlovich (1951—) 301 ff. 
Rees, David (1918—) 326ff. 

Regev, Amitai 296f. 

Remak, Robert (1888-1942?) 139 
Restivo, Antonio 418 

Retakh, Vladimir S. 353 

Roganov, Yu. V. 83 

Rosset, Shmuel 293 

Rowen, Louis Halle 335 




Ryabukhin, Yurii Mikhailovich (1939-) 333


Samuel, Pierre (1921—) 342 

Sandomierski, Frank L. 289 

Sardinas, August A. 412 

Sasiada, Edward (1924-99) 325 

Schanuel, Stephen H. (1933-) 57f. 

Schelter, William 336 

Schilling, Otto Friedrich Gerhard
(1911-73) 359

Schmidt, Otto Yulevich (1891—1956) 138f. 

Schofield, Aidan Harry (1957—) 365 

Schopf, A. 55 

Schreier, Otto (1901-29) 94, 112ff., 413, 
423 

Schröter, Karl 11

Schur, Issai (1875-1941) 103, 189, 229 

Schützenberger, Marcel-Paul (1923-96)
417, 424f., 427 

Serre, Jean-Pierre (1926—) 218 

Shannon, Claude Elwood (1916-2001) 372 

Siegel, Carl Ludwig (1896-1981) 127 

Sierpiński, Wacław (1882-1969) 14

Singleton, R. C. 376 

Skolem, Albert Thoralf (1887-1963) 183f. 

Smith, Henry John Stephen (1826-83) 271 

Smoktunowicz, Agata 332 

Stallings, John R. (1935—) 159 

Staudt, Christian von (1798-1867) 346 

Suslin, Andrei A. 85, 217 


Takahashi, M. 134 




Tamagawa, Tsuneo 126 

Tannaka, Tadao 197 

Thompson, John Griggs (1932—) 105, 159, 
262 

Treur, Jan (1952-) 344 

Tsen, Ch. C. 199f. 

Turing, Alan Mathison (1912-54) 403 


Vandermonde, Alexandre Theophile 
(1735-96) 252 

Varshamov, R.R. 375 

Villamayor, Orlando E. 173 


Warning, Ewald 199 

Watts, Charles E. (1928—) 88, 150 

Webber, David B. 155 

Wedderburn, Joseph Henry Maclagan 
(1882-1948) 139, 172, 186f., 217, 325 

Weyl, Hermann (1885-1955) 116, 278 

Whaples, George (1914-81) 180 

Whitehead, John Henry Constantine 
(1904-60) 196, 348 

Wielandt, Helmut (1910-2001) 106 


Yoneda, Nobuo 87 
Young, Alfred (1873-1940) 247ff. 


Zassenhaus, Hans J. (1912-91) 103 
Zelinsky, Daniel (1923-) 146, 173 


Zieschang, Heiner (1936—) 134 
Zorn, Max August (1906-93) 9 


Subject index 


AB5 axiom 47 

abelian category 38 

abelian valuation 361 

abelianization 45, 80f. 

absolutely flat ring 163 

ACC (ascending chain condition), 
Noetherian 265 

acceptor 404 

accessible 405 

acyclic complex 65, 67 

additive category, functor 35, 41 

adjoint associativity 52 

adjoint functor, pair 44 

afford 223 

algebra 1f., 322 

algebraic power series 426 

alphabet 371, 396, 399 

alternating form, matrix 121

alternating group 240 

ambivalent group 253 

Amitsur’s theorems 306, 320 

Amitsur-Levitzki theorem 293f. 

Andrunakievich-Ryabukhin theorem 
333 

annihilator 283, 285 

antichain 296 

arity 2 

Artin’s problem 365ff. 

Artin—Procesi theorem 336 

atom 398 

augmentation ideal, map 96 

augmented algebra 322 

automaton 403ff. 

automorphism 3 

automorphism class group 93 


averaging lemma 227 
Azumaya algebra 159, 336 


Baer (upper, lower) nilradical 332f. 
Baer product, sum 98, 206 
Baer’s criterion 53 

balanced mapping 51 

bar resolution 97 

basic ring 176 

basis 284 

BCH-code 388 

behaviour 404 

Bezout domain 270, 339 
bialternant 252 

bicentral(izer) 312 

bicyclic monoid 428 
bidimension, bidim 169 

bifix code 419 

bifunctor, balanced 72 

binary code 372ff. 

binary symmetric channel 372
binary tetrahedral group 218 
biproduct 35 

bit 372 

block code 373 

boundary 64, 97 

box principle 26 

Brauer class, group 188ff., 193f. 
Bruhat normal form 349, 353
Burnside’s theorems 109, 259f. 


cancellation monoid 397 
Capelli polynomial 302 
cardinal number 26 
carrier 2 




Cartan—Brauer—Hua theorem 344 
central algebra 180 

central extension 95 

central polynomial 300 
central separable algebra 336 
centre of a category 153 
centroid 342 

CF-grammar, language 401f. 
C-field 198 

chain map, complex 64 
chain equivalence 65 
change-of-rings 53, 82 
channel capacity 373 
character 233ff., 382 
characteristic 269, 343 
characteristic class 73 
characteristic series 425 
check polynomial 385 


Chevalley’s extension theorem 361ff. 


Chevalley-Warning theorem 199 

Chomsky hierarchy 400 

Chomsky normal form 428 

classical groups 116 

clause-indicator 400 

clone of operations 13 

coaccessible 405 

co-boundary, -chain, -cycle 97 

cochain complex 64 

code 371ff., 411 (see also under the 
particular type of code) 

cofinite subset 21 

cogenerator 56, 164 

coherence 46 

coherent ring 165 

cohomological dimension, cd 59 

cohomology group 96, 169 

coimage, cokernel 37f.

coinduced extension, module 54 

colimit 46

comaximal 325 

comma category 33

compactness theorem 23 

comparison theorem 67 

compatible 3 

complement 102, 260 

complete automaton 404 

complete subset of a monoid 417 

completely primary ring 136 

completely reducible 225 

complex 38 




complex-skew polynomial ring 278 
composition 4, 13 
congruence 8 

conical monoid 397 
conjugate 111, 144, 184 
connecting homomorphism 66 
constant operation 1 
context-free, sensitive 401ff.
contragredient 226 
converge 420 

Conway group 392 
copower, coproduct 34f. 
core of a matrix 349 

core of a right ideal 484 
corestriction, cor 212 
correspondence 4

coset diagram 113 

coset leader 378 

covering 375 

crossed homomorphism 98 
crossed product 204 
CS-grammar, language 401f. 
cycle 64, 97, 247 

cyclic algebra 215

cyclic code 384 

cyclic conjugate 111 


DCC (descending chain condition), 
Artinian 179 

Dedekind’s lemma 205 

Dedekind domain 60f. 

defining relations 91 

degree function 275 

degree of a central simple algebra 189 

denominator 267 

dense action 179, 312 

dense functor 42 

density theorem 180, 309, 313 

dependence 283f. 

derivation 78, 98, 275 

derived functor 68ff. 

derived operation 13 

determinant 346ff., 351 

deterministic automaton 411 

diagonal functor 46

diamond lemma 19, 110 

Dieudonné determinant 351ff.

differential 96f. 

differential module 64f. 

dihedral group 93, 240, 246 




Dilworth’s theorem 296 
dimension, dim 309 
direct family, limit 46 
direct power, product 3 
divisible module 54, 166 
division algebra 180 
dominate 361 

doubly even code 393 
dual code 377 

Dyck code 419 


edge 404 
Eilenberg trick 88 
elementary equivalence 29 


elementary expansion, reduction 110 


elementary matrix 116 
elementary sentence 23 
empty language 400 
endomorphism 3 

enough pro(in)jectives 54f. 
entropy 373 

enveloping algebra 168 

epic, epimorphism 37 
equivalent categories 42 
equivalent codes 377 
equivalent representations 222 
error-correcting, -detecting 374 
essential extension 32, 282 
essential homomorphism 140 
essentially unary 13 
Euclidean domain 117, 272
Euler characteristic 72 

exact category 38 

exact functor 42 

exact homology sequence 67 
exact sequence 38 

exact, left, right 42f.
exponent 208 

Ext-functor 73 

extension object 39 

extension of a code 381 
extension of groups 91ff. 
exterior algebra 293 


factor set 94 

factor theorem 7 

faithful functor 42, 176 
faithful module 315 
faithful representation 222 


Feit-Thompson theorem 105 
fibre of a mapping 5 

field of fractions 268 

filter 20 

final object 33 

finitary 1 

finite intersection property 21
finite set 26 

finite state language 401 
finitely presented, related 160 
fir (free ideal ring) 61, 338, 421 
Fitting’s lemma 136 
Five-lemma 71 

flat module 75, 161 

formal Laurent series 280
formal power series 419 

free G-algebra 16 

free field 357 

free monoid 396, 411ff.
Frobenius kernel, subgroup 261 


Frobenius theorems 237, 256, 261


full action 312 

full functor 42 

full matrix 354, 376 
fully invariant 15, 295 


Galois connexion 6 
Gaussian extension 360 
general linear group 116, 222 




generalized polynomial identity, GPI 303ff. 


generating set 3, 91, 396 
generator 56, 149 

generator matrix 376 
generic division algebra 300 
generic matrix ring 299f. 
Gilbert-Varshamov bound 375 
global dimension 59, 75ff. 
Golay code 392 

Goldie ring 286 

Goldie’s theorem 288 
Goppa code 389 
GPI-theorem 305 

grammar 400 

graph of a mapping 4 
Greibach normal form 403
group algebra 224 

group code 419 

group laws 14

group word 110 




Hall subgroup 102 

Hall system 106 

Hall’s theorem 105f. 
Hamming bound 375 
Hamming code 380 
Hamming distance 373 
Hankel matrix, determinant 428 
HCLF (highest common left factor) 272
hereditary ring 60 

Hilbert basis theorem 278 
Hilbert syzygy theorem 85 
Hochschild group 169 

hom functor 48ff. 
homogeneous 420 
homological dimension, hd 58 
homology group 64, 96, 169 
homomorphism 3 

homotopy 65f., 97 

honest mapping 357 

Hopfian group 115

Hua’s identity, theorem 344f. 
hyperbolic pair 122 


IBN (invariant basis number) 72, 338 
identity 14 

image, im 5, 37

index of central simple algebra 189 
index reduction factor 190 
induced extension, module 54 
induced representation 254ff. 
induction algebra 24ff. 
inductive limit 46 

inertia lemma 340f. 
infinitesimal 132 

inflation map 100, 211 
information rate 372 

initial object 33 

injective dimension 59 
injective hull 55 

injective object, module 48 
inner rank 354 

input 403 

intertwining number 235 
invariant element 271
invariant subring 358 

inverse 4f. 

inverse family, limit 46f. 
invertible ideal 60 

involution 225 

irreducible representation 224 




ISBN book number 393 
isomorphic idempotents 144 
isomorphism theorems 8f. 
iteration lemma 428 


Jacobson radical 318, 324 


Kaplansky’s theorem 320 
kernel 5, 37, 232 

Köthe nilradical 331

Köthe’s conjecture 332
Kraft-McMillan inequality 416 
Krull-Schmidt theorem 138 


language 399ff. 

large submodule 52, 282 
law 14 

Leech lattice 392 

length 396, 404 

Levitzki radical 332
lifting an idempotent 143 
limit 46 

linear category, functor 50
linear character 233 
linear code 376 

linear group 116ff.
Litoff’s theorem 316 
local ring 135, 423 
localization 266 

locally nilpotent 332 
loop functor 58 

lower central series 115 
lower nilradical 332, 342 


lower segment 413 


MacWilliams identity 382f. 
Magnus’ theorem 115 
mapping cone, cylinder 88 
Maschke’s theorem 226ff., 242 
McMillan inequality 415 
measure 414 

metacyclic group 101
modular law 91 

modular right ideal 323 
monic, monomorphism 36 
monoid 395 

monomial matrix 107, 350 
Morita context 156 

Morita equivalence 148ff. 
Morita invariant 153 




multilinear 291 
multiplicator 100 


Nagata-Higman theorem 308, 333 
Nakayama’s Lemma 140f. 

natural isomorphism, transformation 41f. 
natural numbers 24ff. 

Nielsen transformation 133f. 

nil(potent) ideal 330 

nilradical 330 

non-standard model 29 

noughtary 1

numerator 267 


obstruction 53 

operation 1

operator domain 2 

optimal code 374 

order function 279 

Ore condition, set 267ff. 
Ore domain 268 

orthogonal group, representation 126 
orthogonal idempotents 142 
orthogonality relations 230 
output 403 


packing 375 

parity check (matrix) 372, 377 
partition lemma 342 

PDA, pushdown acceptor 409 
Peano axioms 24 

perfect code 375 

perfect group 118 

perfect ring 148 

permutation matrix 350 
permutation module 238 
phrase structure grammar, language 400 
PI-algebra 290 

PI-degree 336

PID principal ideal domain 270 
place 364 

Plotkin bound 381, 393 
polynomial 420 

polynomial identity, PI 290, 303 
Posner's theorem 335f. 

prefix notation 12 

prefix set, code 412ff.
presentation matrix 160 
primary algebra 209 

prime ideal 329 




prime radical 332 

prime ring 30, 270, 282, 329 
primitive idempotent 142 
primitive permutation group 119 
primitive ring 315ff. 
principal character 233 
principal ideal domain (PID) 271ff.
principal valuation 359 
principle of induction 24 
product 34 

profinite group 47 
progenerator 158 

projection operator 13, 34 
projective cover 140 
projective dimension 58 
projective equivalence 58 
projective limit 46

projective linear group 118 
projective object 48 
projective resolution 57, 67 
proper language 400 
pseudolinear extension 367 
pullback of a mapping 53 
pullback of a pair 39 
pumping lemma 408 
puncturing a code 375 

pure extension, sequence 176 
pushout 39 


QR-code 390 

quasi-algebraically closed 198 
quasi-commutative valuation 363 
quasi-inverse, -regular 318 
quasideterminant 353 
quaternion algebra, group 201ff.
quotient algebra 7 

quotient category 151 

quotient object 37 

quotient ring 268 


radicals compared 333 

rank (free group) 111 

rank (linear mapping) 310 
rank (module) 284 

rational power series 423 
Razmyslov polynomial 302 
Razmyslov transposition 301 
reciprocity 256 

recursion principle 27f. 
reduced acceptor 405 




reduced norm, trace 195ff. 
reduced ring 330 

Rees matrix algebra 326ff. 
Regev's theorem 297 

regular element 265, 288 
regular field extension 182 
regular grammar, language 401, 410
regular matrix 326 

regular representation 224, 397 
regular ring 163 

relatively free algebra 295 
relatively injective, projective 54 
repetition code 388 
representable functor 43 
representation 221 

residually-P 10 

residue class field 135, 423 
restriction, res 100, 210 
retraction 39, 151 

reverse automaton 411 
rewriting rule 400 

right vanishing 148 

rigid monoid 397 

ring 78, 365 

Roganov's theorem 83 

row 11 

row-finite matrix 160, 310 
Rowen's theorem 335


S-function 252 

sandwich matrix 326 
Schanuel’s lemma 57f., 60 
Schreier extension theorem 94 
Schreier set 413 

Schreier subgroup theorem 114 
Schreier transversal 113 
Schreier’s formula 114 
Schreier—Lewin formula 423 
Schur index 189 

Schur’s lemma 229f. 
Schur-Zassenhaus theorem 103 


Schützenberger theorems 417, 424, 427


section 39 

section functor 151 

self-dual code 377 
self-embedding grammar 428 
semi-Artinian module 307 
semidirect product 93 
semifir 339 

semigroup 399 




semihereditary ring 89 
semiperfect ring 144 
semiprime ideal 330 
semiprime ring 282, 330 
semiprimitive ring 318, 324 
separable algebra 168, 336 
separable states 405 
separating idempotent, separator 169 
sequential machine 403 
Shannon’s theorem 373 
shortening a code 376 
signature of an algebra 2 
similar algebras 187 

similar elements 428 
simple algebra 7, 326 
simple module 323 
Singleton bound 376 
singular matrix 347 
singular submodule 286 
singularity support 357 
skew field 180, 343ff. 

skew Laurent polynomial 279 
skew polynomial (ring) 276 
Skolem—Noether theorem 183f. 
snake lemma 63 

socle 315 

spanning set 113, 284 
special linear group 116 
specialization lemma 355 
sphere-packing bound 375 
split exact sequence 39 
split extension 92 

splitting field 189 

stabilizer 119 

stably free module 86 
staircase lemma 294 
standard array 379 
standard polynomial 290 
standard resolution 97, 170 
state 403

strongly nilpotent 331 
subdirect product 9 
subdirectly reducible 10 
subobject 37 
subrepresentation 224 
successful path 404 
successor function 24 
suffix 412 

support of a power series 420 
symbol 216 




symmetric algebra 81
symmetric group 240, 247ff.
symmetrizer 250
symmetry 127
symplectic group, space 121
syndrome decoding 378
syzygy 85f.


T-ideal 295
Tannaka-Artin problem 197, 282
tensor product 51f.
tensor ring 78, 303
terminal letter 400
three-by-three lemma 41, 87
top of a module 141
topology of pointwise convergence 313
torsion product, Tor 75
torsion submodule 274
total automaton 411
total divisor 271
total quotient ring 268
total subring 358
totally isotropic subspace 122
transfer 107f., 212
transgression 100
transition function 403
transitive permutation group 113
translation ring 279
transvection 117, 122f.
trim automaton 405
trivial 222, 233, 338, 359, 375
trivial algebra 2
trivializable 338f.
Tsen’s theorem 199
Turing machine 395, 403
type of a language 400f.


ultraproduct, -power 22, 357
unambiguous 413
uniform measure 414
uniform module 164, 283
unipotent matrix 226
unit 398
1-unit 359
unital module 322
unitary group, representation 242
unitriangular 349
universal code 388
universal derivation bimodule 79f.
universal field of fractions 354, 357
universal language 400
universal property 33
upper nilradical 331


valency 11
valuation 359ff.
valuation ring 358
value group 358
variable 400
variety 15


weak dimension 75f.
weakly finite ring 354
Wedderburn decomposition theorem 325
Wedderburn’s principal theorem 172
Wedderburn’s theorem on finite fields 186
Wedderburn-Artin theorem 309
weight 376, 388
weight enumerator 382f.
well-ordered set 26
Weyl algebra 278, 281
Whitehead group 196
Whitehead’s lemma 348
width 296
Wielandt’s criterion 106
windmill lemma 41, 87
word 371, 399
word algebra 11


X-inverting homomorphism 265


Yoneda’s lemma 87
Young diagram, tableau 248ff.


zero-sum code 388
zero delay code 412
zero morphism, object 36


Further Algebra and Applications is the second volume of a new and revised
edition of P.M. Cohn’s classic three-volume text Algebra, which is widely
regarded as one of the most outstanding introductory algebra textbooks. For
this edition, the text has been reworked and updated into two self-contained,
companion volumes, covering advanced topics in algebra for second- and
third-year undergraduate and postgraduate research students.


The first volume, Basic Algebra, covers the important results of algebra; this
companion volume focuses on the applications and covers the more advanced
parts of topics such as:


groups and algebras

homological algebra

universal algebra

general ring theory
representations of finite groups
coding theory

languages and automata


The author gives a clear account, supported by worked examples, with full
proofs. There are numerous exercises with occasional hints, and some historical
remarks.


P.M. Cohn is an Honorary Research Fellow of University College London and a
Fellow of the Royal Society.


Reviews of Algebra:
“... there is no better textbook on algebra than the volumes by Cohn.”
Professor Walter Benz, Universität Hamburg, Germany


“... that this will soon become a standard reference work in ...
I can recommend this book without reservations.”
J.D.P. Meldrum, University of Edinburgh, UK


ISBN 1-85233-667-6


